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properties can be modeled. The model is applied to the test case study of a DNA loop clamped 
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atom structures on the basis of the model, are discussed. 
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I. INTRODUCTION 

Formation of DNA loops is a common motif of protein- 
DNA interactions [JUlE Q . A segment of DNA forms 
a loop-like structure when either its ends get bound by 
the same protein molecule or a multi-protein complex, 
or when the segment gets wound around a large multi- 
protein aggregate, or when the segment connects two 
such aggregates. In bacterial genomes, DNA loops were 
shown to play important roles in gene regulation [2J,|2J; in 
eucaryotic genomes, DNA loops are a common structural 
element of the condensed protein-DNA media inside nu- 
clei 0, ■ Understanding of the structure and dynamics 
of DNA loops is thus vital for studying the organization 
and function of the genomes of living cells. 

The amount of experimental data on the DNA phys- 
ical properties and protein-DNA interactions both in 
vivo and in vitro has grown dramatically in recent 
years |5j . With the advent of modern experimental tech- 
niques, such as micromanipulation 0,0)0 and fast reso- 
nance energy transfer [i| , researchers were presented with 
unique opportunities to probe the properties of individual 
macromolecules. X-ray crystallography, NMR, and 3D 
electron cryomicroscopy [l(]| | provided numerous struc- 
tures of protein-DNA complexes with resolution up to a 
few angstroms, including such huge biomolecular aggre- 
gates as RNA polymerase an d nucleosome [P3 T13j. 
The ever growing volume of data provides theoretical 
modeling, which has generally been recognized as a vital 
complement of experimental studies, with an opportunity 
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to revise the existing models, and to build new improved 
models of biomolecules and biomolecular interactions. 

Several existing DNA models are based on the theory 
of elasticity El 13 E3- These models treat DNA 
as an elastic rod or ribbon, sometimes carrying an elec- 
tric charge. The geometrical, energetic, and dynamical 
properties of such ribbon can be studied at finite tem- 
perature using Monte-Carlo or Brownian Dynamics tech- 
niques employing a combined elastic/electrostatic energy 
functional [la, H3 • Such studies usually involve exten- 
sive data generation, for example, numerous Monte Carlo 
structural ensembles, and require significant investment 
of computational resources. Alternatively, one could re- 
sort to faster theoretical methods, such as statistical me- 
chanical analysis of the elastic energy functional 0, 0] 
or normal mode analysis of the dynamical properties of 
the elastic rod [20j] • A fast approach to studying the 
static properties of DNA loops - such as the loop en- 
ergy, structure, and topology - consists in solving the 
classical equations of elasticity [S , dated back to 
Kirchhoff [23] and derived on the basis of the same en- 
ergy functional. The equations can be solved with cither 
fixed boundary conditions for the ends of the loop or un- 
der a condition of a constant external force acting on the 
ends of the loop 0, IM El El IS E3 ■ 

In order to achieve a realistic description of the physi- 
cal properties of DNA, the classical elastic functional [2^, 
[SE3 has to be modified: (i) the modeled elastic ribbon 
has to be considered intrinsically twisted (and, possibly, 
intrinsically bent) in order to mimic the helicity of DNA, 
(ii) the ribbon has to carry electrostatic charge, (iii) be 
anisotropically flexible, i.e., different bending penalties 
should be imposed for bending in different directions, 

(iv) be deformable, e.g., through extension and/or shear, 

(v) have different flexibility at different points, in order 
to account for DNA sequence-specific properties, (vi) be 
subject to possible external forces, such as those from 
proteins or other DNA loops. The earlier works have pre- 
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sented many models including several of these properties, 
e.g., bending anisotropy and sequence-specificity [20| : 
extensibility, intrinsic twist/curvature, and electrostat- 
ics j2^| ; extensibility, shearability, and intrinsic twist [2^] ; 
intrinsic curvature and electrostatics |3l|; or intrinsic 
twist and forces due to self-contact pM l27| . 

Yet, to the best of our knowledge, a completely real- 
istic treatment, where the theory of elasticity would be 
modified as to include all of the listed DNA properties, 
has never been published. While for DNA segments of 
large length some of these properties can be disregarded 
or averaged out using a proper set of effective parame- 
ters, we feel that a proper model of DNA on the scale 
of several hundred base pairs - which is a typical size of 
DNA loops involved in protein-DNA interactions - must 
be detailed, including a proper description of all physical 
properties of real DNA. 

This work offers a step towards such a generalized elas- 
tic DNA model. The Kirchhoff equations of elasticity are 
derived in Sec. ITTl below for an intrinsically twisted (and 
possibly bent) elastic ribbon with anisotropic bending 
properties. The terms corresponding to external forces 
and torques are included and can also be used to ac- 
count for the electrostatic self-repulsion of the rod, as 
described in Sec. [V] All the parameters are considered 
to be functions of the ribbon arclength s, thus making 
the DNA model sequence-specific. Only the DNA de- 
formability is omitted from the derived equations - yet 
can be straightforwardly included in the problem, as dis- 
cussed in Sec. I VII The numerical algorithm for solving 
the modified Kirchhoff equations, based on the earlier 
work H3,|32, is presented fSec. IIII 5|) . 

The proposed model is used to predict and analyze the 
structure of the DNA loops clamped by the lac repressor, 
a celebrated E. coli protein, reviewed in Sec. IIII Al The 
system provides a typical biological application for the 
developed model and is used to extensively analyze the 
effect of bending anisotropy (Sec. IIV|) and electrostatic 
repulsion (Sec.[Vj in our model and to evaluate the corre- 
sponding model parameters. The DNA sequence-specific 
properties in terms of elastic moduli and intrinsic curva- 
ture, while included in the derived equations, were not 
used in the study of the lac repressor system. 

The further developments that would make the model 
truly universal and the model's scope of applicability are 
discussed in Sec. I VII Notably, it is shown how the elastic 
rod model can be combined with the all-atom model in 
multi-scale simulations of protein-DNA complexes or how 
all-atom DNA structures can be built on the basis of the 
coarse-grained model. 



II. BASIC THEORY OF ELASTICITY 

In this section we describe first the classical Kirchhoff 
theory of elasticity and then how this theory can be ap- 
plied to modeling DNA loops. 




FIG. 1: Parameterization of the elastic rod. (a) The centerline 
f(s) and the intrinsic local frame di, d,2, d$. (b) The principal 
normal n(s) = r/\r\ and the binormal b(s) — d$ x n form the 
natural local frame for the 3D curve r(s) |33J. 

A. Kirchhoff model of an elastic rod 

The classical theory of elasticity describes an elastic 
rod (ribbon) in terms of its centerline and cross sections 
(Fig.^a). The centerline forms a three-dimensional curve 
f(s) = (x(s), y(s), z(s)) parametrized by the arclength s. 
The cross sections are "stacked" along the centerline; a 
frame of three unit vectors d\(s), d,2(s), ^(s) uniquely 
defines the orientation of the cross section at each point s. 
The vectors d\ and d-2 lie in the plane of the cross section 
(for example, along the principal axes of inertia of the 
cross section), and the vector d 3 = di x d 2 is the normal 
to that plane [TtJ ■ If the elastic rod is inextensible then 
the tangent to the center line coincides with the normal 

f(s) = d 3 (s) (1) 

(the dot denotes the derivative with respect to s). 

The components of all the three vectors di can be 
expressed through three Euler angles </>(s), tp(s), 0(s), 
which define the rotation of the local coordinate frame 
(di, 1^2,^3) relative to the lab coordinate frame. Alter- 
natively, one can use four Euler parameters qi(s), q% (s), 
13 ( s ) 7 Q4 ( s ) i related to the Euler angles as 

qi = sin 6»/2 sin(>- ip)/2, 
q 2 = sin 0/2 sin (0-^0/2, 
q 3 = cos6>/2 sin(0 + V)/2, 
q4 = cos 6/2 cos {<f> + ip)/2, 

and subject to the constraint 

<?1 + 92 + ?3 2 + «4 = 1 (3) 

(see, e.g., The computations presented in this pa- 

per employ the Euler parameters in order to avoid the 
polar singularities inherent in the Euler angles. 

Following Kirchhoff 's analogy between the sequence of 
cross-sections of the elastic rod and a motion of a rigid 
body, the arclength s can be considered as a time-like 
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variable. Then the spatial angular velocity k of rotation 
of the local coordinate frame can be introduced: 

<?i =1 _ 3 =kxdi, k= {K lt K 3 , 0} . (4) 

The vector k is called the vector of strains 0, ■ Ge- 
ometrically, its components K\(s) and K2(s) are equal 
to the principal curvatures of the curve f(s), so that the 
total curvature equals 

K( S ) = y/Kl( S )+K*(s), (5) 

and the vectors of principal normal n(s) and binomial 
b(s) (Fig.HJb) are 

n{s) = K 2 /K dx - Kx/K d 2 , (6) 
b(s) = Kx/K dx + K 2 /K d 2 . (7) 

The third component of the vector of strains O(s) is the 
local twist of the elastic rod around its axis d 3 [2l|. All 
three components can be expressed via the Euler param- 
eters Qi-x-A [3 P-16]: 

K x = 2(q i q 1 + q 3 q 2 - q 2 q 3 - QlQi) , (8) 
K 2 = 2(-q 3 qx + <74<72 + qiq 3 - 92(74) , (9) 
O = 2(q 2 qx - qiq 2 + q±q 3 - <23<74) ■ (10) 

If the elastic rod is forced to adopt a shape differ- 
ent from that of its natural (relaxed) shape, then elastic 
forces N(s) and torques M(s) develop inside the rod: 

N(s) = Nxdx + N 2 d 2 + N 3 d 3 , (11) 
M(s) = Mxdx + M 2 d 2 + M 3 d 3 . (12) 

The co mponents N x and ^2 compose the shear force 
N s h = \J N'x + N'i ; the component N 3 is the force of ten- 
sion, if ./V3 > 0, or compression, if N 3 < 0, at the cross 
section at the point s. Mx and M 2 are the bending mo- 
ments, and A/3 is the twisting moment. In equilibrium, 
the elastic forces and torques are balanced at every point 
s by the body forces f(s) and torques g(s), acting upon 
the rod: 

fi + f=0, (13) 
M+<f+fxiV = 0. (14) 



The body forces and torques of the classical theory usu- 
ally result from gravity or from the weight of external 
bodies - as in the case of construction beams. In the 
case of DNA, such forces are mainly of electrostatic na- 
ture, as will be described below. 

The last equation required to build a self-contained 
theory of the elastic rod relates the elastic stress to the 
distortions of the rod. The classical approach stems from 
the Bernoulli-Euler theory of slender rods, which stipu- 
lates the elastic torques M(s) to be linearly dependent on 
the curvatures and twist of the inherently straight rod: 



M(s) = AxKxdx + A 2 K 2 d 2 + CVLd 3 . (15) 
The linear coefficients Ax and A 2 are called the bending 
rigidities of the elastic rod, and C is called the twisting 
rigidity. For a solid rod, the classical theory finds that 
Ax = EIx, A 2 — EI 2 , where E is the Young's modulus 
of the material of the rod, and 7i, I 2 are the principal 
moments of inertia of the rod's cross-section. The twist- 
ing rigidity C is proportional to the shear modulus fi of 
the material of the rod; in the simple case of a circular 
cross-section, C — 2/i7i = 2[il 2 . 

The equations QJ, ||3J|, 0, O, and (JT^J form the 
basis of the Kirchhoff theory of elastic rods. We simplify 
the equations, first, by making all the variables dimen- 
sionless: 



s = s/l, x = x/l, y = y/l, z = z/l , (16) 

K 1= lKx, K 2 = IK 2 , fl = m, (17) 

a = Ax/C, = A 2 /C, 7 = C/C=1, (18) 

N i= x- 3 = Nil 2 /C, M i=1 _ 3 = Mil/C, (19) 



where I is the length of the rod. Second, we express 
the derivatives <ji through K\, K 2 , O, and the Euler pa- 
rameters, using equations I|8I1U|I and the constraint J3J|, 
differentiated with respect to s. Third, we eliminate the 
variables Nx and N 2 using equations i|12[l and arrive at 
the following system of differential equations of 13-th or- 
der [z|: 



aKx = (2/3 - l)(K 2 n) - 0K 2 tl + (a - l)Kxtf + KxN 3 + Qg 2 - f 2 - g\ , 

f3K 2 = (1 - 2a){KxQ) + aKxti + ((3- l)K 2 tf + K 2 N 3 - Qgx + fx - h , 

Q = (a - p)KxK 2 ~ g 3 , 

iV 3 = -aKxKx - PK 2 K 2 - 00 - gxKx - g 2 K 2 - g 3 V - h , 



(20) 
(21) 
(22) 
(23) 
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The solutions to this system correspond to the equilib- 
rium conformations of the elastic rod. The 13 unknown 
functions - f(s), #4=1—4(5), N(s), and M(s) - de- 
scribe the geometry of the elastic rod and the distribu- 
tion of the stress and torques along the rod. The equa- 
tions can be solved for various combinations of initial and 
boundary conditions. The case considered in this paper 
will be the boundary value problem, when the equilib- 
rium solutions are sought for the elastic rod with fixed 
ends - that is, with known locations r(0), r(l) of its ends 
at s — and s = 1, and known orientation qi=i-n(0), 
gi=i-4(l) of the cross section at those ends. Such case 
would correspond, for example, to a DNA loop whose 
ends are bound to a protein. 

In general, the system (|2UI3U|I has multiple solutions 
for a given set of boundary conditions. The dimension- 
less elastic energy of each solution is computed - accord- 
ing to the Bernoulli-Eulcr approximation l|15|) - as the 
quadratic functional of the curvatures and the twist: 

f 1 faK? /3K$ n' 2 \ , 

The straight clastic rod becomes the zero-energy ground 
state for the functional (|31|l . If the interactions of the 
elastic rod with external bodies, expressed through the 
forces / and torques g, are not negligible then the energy 
functional will include additional terms due to those in- 
teractions. 




FIG. 2: (a) The elastic rod fitted into an all-atom structure 
of DNA. (b) A coordinate frame associated with a DNA base 
pair, according to |35l| . 



B. The elastic rod model for DNA 

Elastic rod theory is a natural choice for building a 
model of DNA - a long linear polymer. The center- 
line of the rod follows the axis of the DNA helix and 
Watson-Crick base pairs form cross-sections of the DNA 
"rod" (Fig. [2Ja) - A coordinate frame can be associated 
with each base pair according to a general convention |35l ] 
(Fig- 121b). For a known all-atom structure of a DNA loop, 
an elastic rod (ribbon) can be fitted into the loop using 
those coordinate frames. Conversely, a known elastic rod 
model of the DNA loop can be used directly to build an 
all-atom structure of the loop (see App. [Q. Finally, if 
the all-atom structure is known only for the base pairs at 
the ends of the loop (as in the case of the lac repressor - 
cf Sec. llllf) then the coordinate frames (di, ^2,^3) can be 
associated with those base pairs and provide boundary 
conditions for Eqs. (|20I3U|I . 

However, several modifications of the classical theory 
are necessary in order to describe certain essential prop- 
erties of DNA. 

First of all, the relaxed shape of DNA is a helix, which 
is described by a tightly wound ribbon rather than a 
straight untwisted rod treated by our equations so far. 
The helix has an average twist of 10°/A so that one he- 
lical turn takes about 36 A. This is much smaller than 
the persistent length of DNA bending (500 A) or twisting 
(750 A) miii, so even a relatively straight segment of 
DNA is highly twisted. We introduce the intrinsic twist 
uj°(s) of the elastic rod as a parameter in our model. It 
is considered to be a function of arclength s, because the 
twist of real DNA varies between different sequences [3^ . 

Second, certain DNA sequences are also known to form 
intrinsically curved rather than straight helices [3|j- In 
terms of our theory that means that the curvature of 
the relaxed rod may be different from zero in certain 
sections of the rod. The intrinsic curvatures n\ 2 (s) are 
introduced in our model similarly to the intrinsic twist - 
as functions of arclength s determined by the sequence 
of the DNA piece in consideration. 

The intrinsic twist and curvatures result in the modi- 
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fled Bernoulli-Euler equation l|15fl : 

M(s) = Amidi + A 2 n 2 d 2 + CW 3 , (32) 

where 

K 1 , 2 (s)=K i {s)-K°(s), u(s)=tl(s)-uj (s). (33) 

Now, the elastic torques are proportional to the 
changes in the curvatures and twist from their intrin- 
sic values, rather than to their total values. Note, how- 
ever, that the "geometrical" eqs. (@J - and consequently, 
eqs. (I24I27|) - still contain the full values of the twist and 
curvature, so that switching from eq. l(T5|l to eq. does 
not simply result in replacing Ki and f2 with and lo 
throughout the system (|20I3Q(I . 

Another important structural property of real DNA, 
which we want to encapsulate in our theory, is the 
sequence-dependence of the DNA flexibility, i.e., that 

I 

(aKi) 
(fa) 

(juj) 
N 3 

9i 
92 
<h 
94 

X 

il 

z 



and the new energy of the elastic rod is computed as: 




The system (|36I46[) describes the elastic rod model of 
DNA in the most general terms. Not all of the options 
provided by such model will be explored in the present 
work; most times the equations will be simplified in one 
way or another. The unexplored possibilities and situa- 
tions when they might become essential will be discussed 
in Sec. EH 

To conclude this section, let us observe two immediate 
results of switching from the classical equations (|20I3U|) 
to the more realistic equations I|36I46I1 ■ First, the high 
intrinsic twist u°(s) results in strongly oscillatory be- 



certain sequences of DNA are more rigid than oth- 
ers 0, 0, |2(} ■ Accordingly, the bending and twisting 
rigidities A\, A 2 , and C in eq. ll3*2^l become functions 
of arclength s rather than constants. The exact shape 
of these functions depends on the sequence of the stud- 
ied DNA piece (cf App. [DJ). The dimensionless bending 
rigidities a, (3, and 7 (as well as the dimensionless forces 
and torques) now become scaled not by C, but by an 
arbitrary chosen value of the twisting modulus C (for 
example, the average twisting rigidity of DNA): 

a = A 1 /C a , f3 = A 2 /C , j = C/C , (34) 
iV i= i- 3 = Nil 2 /C , M j= i_ 3 = Mil/Co (35) 

(c/eq. nana). 

With the above changes, the new 'grand' system of 
differential equations becomes: 



(36) 
(37) 
(38) 
(39) 
(40) 

(41) 

(42) 

(43) 

(44) 
(45) 
(46) 



havior of solutions to the system I|36I46[) . Due to the 
non-linear character of the system, the oscillatory com- 
ponent can not be separated from the rest of the solution. 
Second, a DNA loop that contains intrinsically bent seg- 
ments (those inside which kJ 2 7^ 0) may not be uniformly 
twisted - as it would be in the classic case - even if the 
loop were isotropically flexible (a = (3) and no external 
torques were acting upon the loop (g = 0). Whereas 
the classical theory necessitates that in such case Cl = 
(c/eq. (|22J0j ^e right-hand part of the updated eq. 
is non-zero if K\^ 2 ^ «i 2< 



= (2/3rt 2 ^) - ilK 2 u) - (3k 2 (1 + anxti 2 - jK^il + K 1 N 3 + Vlg 2 - f 2 - g\ , 

= -(2a«if2) + (jKiu) + a/tiQ + f3n 2 n 2 - ^K 2 loVL + K 2 N 3 - Qgi + ft- g 2 , 

= an\K 2 — (5Kin 2 — g 3 , 

= -(otKxjKx - (f3n 2 )K 2 - (7w)fl - giKi - g 2 K 2 - g 3 Cl - f 3 , 

= ^194-^293+^2), 

= 2^ K!q 3 + K 2 q4: - ttqi) , 

= ^(-K iq2 + K 2qi +n q4 ) , 

= l;(- K m- K 2 q 2 -nq 3 ) , 

= 2(<7i<7 3 + q 2 q 4 ) , 

= 2(g 2 93 - 9194) , 

2 2,2,2 

= -9i - 9 2 + 9 3 + 9 4 . 

I 



6 



III. ELASTIC ROD SOLUTIONS FOR THE DNA 
LOOP CLAMPED BY THE LAC REPRESSOR 

In this section we first describe the test case system 
for our theory: the complex of the lac repressor protein 
with DNA. Then this protein-DNA system is used to il- 
lustrate the numeric algorithm of solving the equations 
of elasticity Finally, different solutions for the DNA loop 
clamped by the lac repressor are presented. 



A. The DNA complex with the lac repressor 

For a test case study, the developed theory was used 
to build a model of the DNA loop induced in the E. coli 
genome by the lac repressor protein. Lac repressor func- 
tions as a switch that shuts down the lactose (lac) operon 
- a famous set of E. coli genes, the studies of which 
laid one of the cornerstones of modern molecular biol- 
ogy Hi HE E2- The g enes code for proteins that are 
responsible for lactose digestion by the bacterium; they 
are shut down by the lac repressor when lactose is not 
present in the environment. When lactose is present, a 
molecule of it binds inside the lac repressor and deacti- 
vates the protein, thereby inducing the expression of the 
lac operon (Fig. |3]a-b). 

The lac repressor consists of two DNA-binding 
"hands" , as it can be seen in the crystal structure of the 
protein [4y| (Fig. 0c). Each "hand" recognizes a specific 
21-bp long sequence of DNA, called the operator site. 
The lac repressor binds to two operator sequences and 
causes the DNA connecting those sequences to fold into 
a loop. There are three operator sequences in the E. coli 
genome: Oi, O2, and O3 H3; the repressor binds to Oi 
and either O2 or O3 , so that the resulting DNA loop can 
have two possible lengths: 385 bp (O1-O2) or 76 bp (Oi- 
O3) (Fig.|31b). All three operator sites are necessary for 
the maximum repression of the lac operon |43l |. While 
the long O1-O2 loop (385 bp) is the easier to form, the 
short 1-0,3 loop (76 bp) contains the lac operon pro- 
moter [8(J so that folding this 76 bp region into a loop is 
certain to disrupt the expression of the lac operon. 

It would hardly be possible to crystallize the DNA 
loops induced by the lac repressor - merely because of 
their size - and thus the crystal structure |4(| of the lac 
repressor was obtained with only two disjoint operator 
DNA segments bound to the protein [8l| • The equations 
of elasticity, discussed above, can be used to build elas- 
tic rod structures of the missing loops, connecting the 
two DNA segments. Such structure would allow to fur- 
ther the study of the lac repressor-DNA interactions in 
several ways. First, the force of the protein-DNA inter- 
actions computed after solving the equations of elasticity 
can be used in modeling the changes in the structure of 
the lac repressor that likely occur under the stress of the 
bent DNA loop. Second, the elastic rod structure of the 
loop may serve as a scaffold on which to build all-atom 
structures of its parts of interest - such as the binding 



sites of other proteins, important for the lac operon ex- 
pression, e.g., RNA polymerase and CAP - or, indeed, 
of the whole loop (cf App. [FJ). All-atom simulations of 
these sites, either alone or with the proteins docked, may 
provide interesting keys to the interactions of the regula- 
tory proteins with the bent DNA loop and therefore, to 
the mechanism of the lac operon repression. Third, one 
can predict how changing the DNA sequence in the loop 
would influence the interactions between the lac repres- 
sor and the DNA - by changing the sequence-dependent 
DNA flexibility in our model and observing the resulting 
changes in the structure and energy of the DNA loop. 
Finally, the lac repressor system can be used to tune the 
elastic model of DNA per se, in terms of parameters and 
complexity level, by comparing the predictions resulting 
from our model with the experimental data. 

These questions will be further discussed in Sec. IVII 
while the following sections will detail our study of the 
lac repressor system. 



B. Solving equations of elasticity for the 
76 bp-long promoter loop O1-O3 

The crystal structure [4(| of the lac repressor-DNA 
complex provides the boundary conditions for the equa- 
tions of elasticity I|36I46|1 . The terminal [83] base pairs 
of the protein-bound DNA segments are interpreted as 
the cross-sections of the loop at the beginning and at the 
end, and orthogonal frames are fitted to those base pairs, 
as illustrated in Fig. [21b. The positions of the centers of 
those frames and their orientations relative to the lab co- 
ordinate system (LCS) provide 14 boundary conditions: 
r(0), r(l), qi=\-i{Q), qi = \-^(l) for equations (|36I46|) . In 
order to match the 13th order of the system, a boundary 
condition for one of the q^s is dropped; it will be au- 
tomatically satisfied because the identity © is included 
into the equations. 

The iterative continuation algorithm used for solv- 
ing the BVP is the same as that used in our previous 
work |32| (with some modification when the electrostatic 
self-repulsion is included into the equations, as described 
in Sec.|VJ. The solution to the problem is constructed in 
a series of iteration cycles. The cycles start with a set of 
parameters and boundary conditions for which an exact 
solution is known rather than the desired ones. Grad- 
ually, the parameters and the boundary conditions are 
changed towards the desired ones. Usually, only some of 
the parameters are being changed during each specific it- 
eration cycle: for example, only r(l) or only a/ (3. During 
the cycle, the parameters evolve towards the desired val- 
ues in a number of iteration steps; the number of steps is 
chosen depending on the sensitivity of the problem to the 
modified parameters. At each step, the solution found on 
the previous step is used as an initial guess; with a proper 
choice of the iteration step, the two consecutive solutions 
are close to each other, which guarantees the convergence 
of the numerical BVP solver. For the latter, a classical 
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FIG. 3: (a) The expressed lac operon. The biomolecules involved are: (1) - lac repressor (shown deactivated by 4 bound lactose 
molecules), (2) - the DNA of E. coll, (3) - RNA polymerase (shown first bound to the promoter and then transcribing the lac 
operon genes), (4) - mRNA (shown as being transcribed by the RNA polymerase). The flag shows the base pair of the 

DNA where the genes of the lac operon begin. The operator sites, marked as O1-3, are shown as shaded rectangles; the position 
of each operator's central base pair is shown in brackets, (b) The repressed lac operon. The lac repressor is shown binding 
two operator sites - either Oi and O3 or Oi and O2 - and the RNA polymerase - released from the operon. The end-to-end 
length of the DNA loop formed in each case is indicated, (c) The crystal structure of the lac repressor |40|l . The DNA loop, 
missing in the crystal structure and shown here in light color, corresponds to an all-atom model fitted to one of the two elastic 
rod structures predicted for the 76 bp loop (see [3^| and Sec. 1111 B} . Small pieces of the crystal structure are omitted from the 
figure for clarity, (d) Same as (c) for the 385 bp loop; the all-atom DNA model is fitted to one of the four possible structures 
of the loop (see Sec. MI CI and Fig. 0. 
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(a) (b) 




FIG. 4: Evolution of the elastic rod structure during the solu- 
tion of the BVP for the short loop, (a) The initial solution: a 
closed circular loop, (b) The solution after the first iteration 
cycle. (c,d) The solutions after the second iteration cycle, 
for the clockwise (c) and counterclockwise (d) rotation of the 
s = 1 end. (e,f) The solutions after the third iteration cy- 
cle; the previous solutions are shown in light color; the views 
from the top include the forces that the DNA loop exerts on 
the DNA segments bound to the lac repressor. The protein- 
bound DNA segments from the lac repressor crystal structure 
are shown for reference only, as they played no role during the 
iteration cycles except for providing the boundary conditions. 



software, COLNEW 0, is employed. COLNEW uses a 
damped quasi-Newton method to construct the solution 




FIG. 5: Extraneous solutions to the BVP obtained for a dif- 
ferent orientation ip of the initial circular loop. The initial 
loop from Fig. 0]a is shown in panel (a) in light color. 



to the problem as a set of collocating splines. 

The exact solution, from which the iteration cycles 
started, was chosen to be a circular closed (r(0) = 
elastic loop with zero intrinsic curvature k° 2 , constant 
intrinsic twist ui° — 34.6 deg/bp (the average value for 
the classical B-form DNA |3(), constant elastic moduli 
a = j3 = |, and no electrostatic charge (<2 DNA = 0) [83| . 
This solution is shown in Fig. 0|a; the explicit form of 
the solution is given in App. ^] The loop started (and 
ended) at the center of the terminal base pair of one of 
the protein-bound DNA segments. The coordinate frame 
associated with that base pair (or the loop cross-section 
at s = 0) was chosen as the LCS. The orientation of the 
plane of the loop was determined by a single parameter 
ip - the angle between the plane of the loop and the 
cc-axis of the LCS. 

In the first iteration cycle, the value of r(l) was 
changed, so that the s — 1 end of the loop moved by 
45 A to its presumed location at the beginning of the 
second DNA segment (Fig.0]b). 

In the second iteration cycle, the cross-section of the 
elastic rod at the s — 1 end was rotated so as to satisfy 
the boundary conditions for (7^=1-4(1) (Fig. 0]c,d). The 
rotation consisted in simultaneously turning the normal 
d% of the cross-section to coincide with the normal to the 
terminal base pair and rotating the cross-section around 
the normal in order to align the vectors d\ and di with 
the axes of the base pair. 

Depending on the direction of the rotation, two differ- 
ent solutions to the problem arise. Rotating the s = 1 
end clockwise results in the solution shown in Fig. 0|c. 
Rotating the end counterclockwise results in the solution 
shown in Fig. 0]d. The former solution is underwound 
by u> = — 1.4 per bp on the average and the latter solu- 
tion is overwound by lu — 1.6 per bp on the average - 
hence, they will be hereafter named "U" and "O". The 
two solutions may be transformed into each other dur- 
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FIG. 6: Distribution of curvature, twist, and energy in the elastic rod BVP solutions (U - left panel, O - right panel) for the 
76 bp DNA loop. In each panel, left column: the isotropic solution with a = /3 = 2/3; center column: the anisotropic solution 
with q = 4/15, /3 = 16/15; right column: the anisotropic solution with electrostatic interactions, for an ionic strength of 
10 mM. 



ing an additional iteration cycle: namely, a rotation of 
the s = 1 end clockwise by 2 ir around its normal trans- 
forms the U solution into the O solution, and vice versa. 
Then, a rotation of the s = 1 end by another turn re- 
stores the U solution, so that a continuous rotation of 
the s = 1 end results in switching between the two so- 
lutions. That happens because of a self-crossing of the 
DNA loop, which is not yet prevented in the model at this 
point. Topologically, rotating the end increases the link- 
ing number |l4l |45| of the loop by 2 7r and a self-crossing 
reduces the linking by the same amount; therefore, two 
full turns are exactly compensated by one self-crossing, 
and the original solution gets restored after two turns. 

In the third iteration cycle, the bending moduli a 
and j3 (so far, equal to each other) were changed from 
0.5 to 2/3 - which is the ratio between DNA bending 
and twisting moduli that most current experiments agree 
upon |84j . Such increase in the bending rigidity slightly 
changed the geometry of the U and O solutions (Fig.^Jc- 
f) and increased the unwinding/overwinding to —1.6 °/bp 
and 2.0°/bp, respectively. The change in ui has a clear 
topological implication: the deformation of the looped 
DNA is distributed between the writhe (bending) of the 
centerline and the unwinding/overwinding of the DNA 
helix. When the bending becomes energetically more 
costly, the centerline of the loop straightens up (on the 
average) and the deformation shifts towards more change 



in twist. 

Notably, two more solutions may result from the iter- 
ation procedure, depending on the orientation t[) of the 
initial simplified circular loop (Fig. |5J). However, for the 
76 bp loop these solutions are not acceptable, because the 
centerlincs of the corresponding DNA loops would have 
to run right through the structure of the lac repressor (cf 
Fig.EJc-d and Fig. 0c). 

Therefore, only the two former solutions to the prob- 
lem are acceptable in the case of the short loop. The 
shapes of the solutions obtained after the third iteration 
cycle become our first-approximation answer to what the 
structure of the DNA loop created by the lac repressor 
must be. The solutions are portrayed in Fig. 0]e-f, and 
the profiles of their curvature, twist, and elastic energy 
density are shown in Fig. (left columns of the two pan- 
els). 

The U solution forms an almost planar loop, its plane 
being roughly perpendicular to the protein-bound DNA 
segments (Fig. 0]e) . The shape of the loop resembles a 
semicircle on two relatively straight segments connected 
by short curved sections to the lac repressor-bound DNA. 
Accordingly, the curvature of the loop is highest in the 
middle and at the ends, the strongest bend being around 
6 deg/bp, and drops in between (Fig. EJ). The average 
curvature of this loop is 3.7 deg/bp. The unwinding is 
constant, by virtue of a = /3 - for the same reason the 
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energy density profile simply follows the curvature plot. 
The total energy of the loop is 33.0 kT, of which 26.8 kT 
are accounted for by bending, and 6.2 kT by twisting. 
The stress of the loop pushes the ends of the protein- 
bound DNA segments (and consequently, the lac repres- 
sor headgroups) away from each other with a force of 
about 10.5 pN (Fig. He). 

The O solution leaves and enters the protein-bound 
DNA segments in almost straight lines, connected by a 
semicircular coil of about the same curvature as that 
of the U solution, not, however, confined to any plane 
(Fig. EJf ) . The average curvature equals 3.6 deg/bp. The 
energy of this loop is higher: 38.2 kT, distributed between 
bending and twisting as 28.5 kT and 9.7kT, respectively. 
The forces of the loop interaction with the protein-bound 
DNA segments equal 9.2 pN and are pulling the ends of 
the segments past each other (Fig. |Ijf). 

Since the energy of the U loop is 5 kT lower than that 
of the O loop, one could conclude that this form of the 
loop should be dominant under conditions of thermody- 
namic equilibrium. Yet, both energies are too high: the 
estimate of the energy of the 76 bp loop from the ex- 
perimental data [46| is approximately 20 kT at high salt 
concentration (see App. [5J|. Therefore, one cannot at 
this point draw any conclusion as to which loop struc- 
ture prevails, and further improvements to the model are 
needed, such as those described in sections TlVIVI 

C. Solutions for the 385 bp-long O1-O2 loop 

Using the same algorithm, the BVP was solved for the 
385 bp loop. Similarly, four solutions were obtained (cf 
Figs. IHe-f, (SJc-d). With the longer loop, the previously 
inacceptable solutions are running around the lac repres- 
sor rather than through it - and therefore, are acceptable. 
All four solutions (denoted as U, U', O, O') are shown in 
Fig. [3 The solutions U and U' are underwound, O and 
O' are overwound. The geometric and energetic param- 
eters of the four solutions are shown in Table [I] 

It is, of course, not surprising that the elastic energy 
and the average curvature and twist of the longer loops 
are smaller than those of the shorter loops of the same 
topology. The curvature and the twist decrease because 
the same amount of topological change (linking number) 
has to be distributed over larger length - and the energy 



Solu- 


< Ul > 


< K > 




U 


U K /U 


U u /U 


tion 


(dcg/bp) 


(dcg/bp) 


(dcg/bp) 


(kT) 






U 


-0.24 


0.73 


1.31 


6.2 


0.88 


0.12 


O 


0.40 


0.80 


1.49 


8.7 


0.76 


0.24 


u' 


-0.33 


1.21 


1.59 


14.4 


0.90 


0.10 


O' 


0.42 


1.23 


1.58 


15.5 


0.86 


0.14 



TABLE I: Geometric and energetic properties of the four so- 
lutions of the BVP problem for the 385 bp-long DNA loop, 
in the case a = f3 = 2/3. 




FIG. 7: Four solutions of the BVP problem for the 385 bp- 
long DNA loop. Underwound solutions are marked by the 
letter 'U', overwound ones, by 'O'. 



density decreases as the square of the curvature/ twist, 
so the integral energy is roughly inversely proportional 
to the length of the loop. More interestingly, it is the U 
loop again that has the lowest energy and, on the first 
look, should be predominant under thermodynamic equi- 
librium conditions. And the formerly extraneous U' and 
O' solutions clearly have such high energies that they 
should practically not be represented in the thermody- 
namical ensemble of the loop structures and could be 
safely discounted for the 385 bp loop as well. 

Yet again, the conclusions, based on the obtained en- 
ergy values, shall be postponed until the present elastic 
rod model is further refined. 
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IV. EFFECTS OF ANISOTROPIC FLEXIBILITY 

As the first step towards refining our model, a closer 
look is taken into the DNA bending moduli A\ and 
Ai (or,a and 3). The ma jority of the earlier treat- 
ments [IM, [T^, |2a E3, Ha |32t|43| used the approx- 
imation of isotropic flexibility: a — (3, which holds for 
an elastic rod with a cross-section that has a rotational 
symmetry of the 4-th order. Such approximation sim- 
plifies the problem, for example, resulting in a uniform 
distribution of the extra twist (unwinding or overwind- 
ing) co along the DNA in the absence of the external 
twisting moment g^: uj(s) — const (c/ eqs. (|22|) . (j2HJ>)- 
However, the DNA cross-section is rotationally asymmet- 
ric (see Fig. 0b). Therefore the DNA is treated here as 
an anisotropically flexible rod - one with a ^ (3. Hav- 
ing started with the isotropic model, we consider in this 
Section the effect of anisotropic flexibility on the confor- 
mation and energy of the elastic rod. 



A. Anisotropic moduli of DNA bending 

The bending and twisting moduli (or persistence 
lengths) A and C of DNA has been measured in many ex- 
periments 0, HE • Most experiments currently agree 
on the ratio A/C = 2/3. However, this value is obtained 
by interpreting the experimental data in terms of isotrop- 
ically bendable DNA, i.e., idealized DNA with a circu- 
lar cross-section. However, the DNA structure clearly 
shows that at the atomic level the DNA bending should 
be anisotropic: for example, bending of DNA towards 
one of its grooves should clearly take less energy than 
bending over its backbone (Fig. [5J ■ Here we model the 
anisotropic rigidity by using two bending moduli A\ and 
A2 for bending in the direction of the grooves or in the 
direction of the backbone, respectively. (Other approxi- 
mations are also conceivable [TtI l49j l. 

How can the experimentally estimated effective bend- 
ing modulus A be related to the two bending moduli? 
To obtain a correct answer one should remember that 
the discussed bending takes place in a tightly twisted 
rod, i.e., one period of the intrinsic twist of the rod is 
much smaller than the typical radius of curvature. In a 
rod without such intrinsic twist, the thermal fluctuations 
of bending in two principal directions (that is, of bending 
around d\ and d,2 - see Fig. 0b) are independent, and a 
well-known formula (see, e.g., |49l liSOjl is obtained from 
the principle of energy equidistribution: 



1 

A 



2 Ui 



1 



(48) 



However, when the anisotropic rod is tightly twisted, 
the thermal fluctuations will inevitably cause bending in 
both principal directions. One may consider, for exam- 
ple, the bending a long screw: both the groove and the 
ridge of its thread will in turn be facing the direction of 



the bend. Since unwinding involves energetic penalty as 
well, the rod can not unwind so as to face the bend only 
with the "soft side". Accordingly, the effective bending 
modulus A will depend on both A\ and A?,. In the rod 
with no intrinsic curvature, the first approximation yields 
the following simple relation: 



A = 



Ai+A-2 



(49) 



(see Add. IT} . 

This single equation is however not sufficient to de- 
rive both a and (3 from the A/C ratio as the ratio of 
A1/A2 = a/ (3 remains unknown. In |4!j, the value of 
a 1/3 — 1/4 was suggested as both being close to exper- 
imental data and reproducing well the DNA persistence 
length in Monte Carlo simulations. Other estimates re- 
sult from comparison of the oscillations of roll and tilt 
- the angles of DNA bending in the two principal di- 
rections |38l l5lj | - which are directly related to a and 
(3. Moreover, there is always an uncertainty due to the 
dependence of both the a/ (3 and the A/C ratio on the 
DNA sequence - and for a specific short DNA loop, such 
as the one studied here, they may differ from the average 
values measured for a long DNA with variable sequence. 

For these reasons, we decided to study the effect of 
bending anisotropy on the structure and energy of the 
lac repressor loops in a broad range of parameters (i = 
a//3 and A/C ~ (a + /3)/2, and to pick the loops with 
A/C = 2/3 and fi = 1/4 for a detailed structural and 
energetic analysis. 



B. Changes to the lac repressor loops due to 
bending anisotropy in the case of /1 = 1/4 

Let us first consider the structure of the U and O loops 
in the specific case fi = 1/4, for it will be easier then to 
understand the general picture afterwards. To obtain the 
new loops, another iteration cycle was performed, dur- 
ing which the previously generated isotropic structures 
with a — (3 — 2/3 were transformed by simultaneously 
changing the bending moduli a and (3 towards the desired 
values of 4/15 and 16/15, respectively. 

The structures of the loops are shown in Fig. |SJ As 
one can see, the U loop did not change much, the root- 
mean square deviation (r.m.s.d.) [85| from the isotropic 
solution being only 1.1 h. In contrast, the O loop changed 
by 4.6 h and bent over itself at the point of near self- 
contact even more than the "soft" isotropic loop with 
a = 0= 1/2 (cf. Fig.Hf). 

The energies of both loops changed significantly, get- 
ting reduced by about one-third. The energy of the U 
loop dropped from 33.0 kT to 23.3 kT, distributed be- 
tween the bending and twisting as 18.9 kT and 4.4 kT, 
respectively. The energy of the O loop dropped from 
38.2 kT to 26.5 kT, distributed between the bending and 
twisting as 22.2 kT and 4.3 kT, respectively. 
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To understand the considerable energy change, one 
should consider the structure of the loops on the small 
scale. The distributions of curvature and twist in the 
loops are shown in Fig. (central columns). As one can 
see, the previously smooth profiles acquired a seesaw pat- 
tern. This happens because the elastic rod - now better 
called elastic ribbon - twists around the centerline with 
a high frequency due to its high intrinsic twist - and 
therefore the vectors d\ and d 2 get in turn aligned with 
the main bending direction (the principal normal ft, see 
Fig-D- In DNA terms, the loop is facing the main bend- 
ing direction with the grooves and the backbone, in turn 
(see Fig.|lJ. 

When the vector d 2 is aligned with ft, all the bending 
occurs towards the grooves, so that = \K\ and K 2 = 
0, and the local energy density equals dU K = aKf/2. 
Then, after a half-period of the intrinsic twist, d\ gets 
aligned with ft and all the bending occurs towards the 
backbone so that K 1 = 0, \K 2 \ = \K\, and dU K = /3K$/2. 
Since a < (3, the latter orientation results in higher en- 
ergetic penalty and unbending moments than the former 
one. Therefore, the sections of the rod facing the main 
bending direction with d\ straighten up and those fac- 
ing it with d 2 become more strongly bent. The resulting 
oscillations of the curvature are seen in Fig. O The struc- 
ture of the rod becomes an intermediate between that of 
a smoothly bent loop and of a chain of straight links, the 
more so the closer the a/ (3 ratio gets to 0. The sections 
where the rod is facing the main bending direction with 
d 2 play the role of the "soft joints" where most of the 
bending accumulates. 

It is this concentration of the curvature in the "soft 
joints" that results in the decrease in the elastic energy 
of the rod. The energy density oscillates together with 
the curvature, reaching maxima at the points of the maxi- 
mum bend (Fig.^J). Since the bending becomes cheaper, 
some of the rod twist gets redistributed into the bend 
and the absolute value of the average twist u> decreased 
for both U and O solutions (Fig. [HJ|, to — 1.3°/bp and 
1.3°/bp, respectively. 

Locally, the twist u> of the anisotropically flexible rod 




FIG. 8: Changes in the predicted structure of the elastic loop 
due to the bending anisotropy. (a) The underwound solution, 
(b) The overwound solution. Structures obtained for a — 
/3 = 2/3 are shown in light; structures obtained for a — 4/15, 
f3 = 16/15 are shown in dark. 



also displays oscillations (Fig. HJ) . When all the bending 
occurs towards di, the twist slightly increases, as to wind 
the "rigid face" away from the main bending direction. 
When the bending occurs towards d 2 , the twist slightly 
decreases as to prolong the exposure of the "soft face" 
to the main bending. These oscillations of the twist are 
however not large, because they inflict a certain energetic 
penalty as well. 

Finally, the force of the protein-DNA interaction 
dropped to 7.9 pN for the U solution and to 7.2 pN for 
the O solution. The direction of the force in the U solu- 
tion changed insignificantly; the force in the O solution 
rotated "upwards" by about 40 degrees. 



C. Changes to the 76 bp lac repressor loops in the 
broad range of parameters fi and A/C 

From the described specific case of /i = 1/4 and 
A/C = 2/3, we proceeded to studying the elastic rod 
conformations in a broad range of parameters A/C and 
/i (or, a and (3). A/C was varied between 1/20 and 20, 
and \x between 10 -2 and 10 2 . In principle, such range is 
too broad as neither the DNA rigidity A/C significantly 
deviates from the range of 1, even by alternative esti- 
mates 0, H3, |3(| , nor is fj, likely to change from 1 by 
two orders of magnitude as the oscillations of DNA roll 
and tilt are normally of the same order [38l l5l| . The val- 
ues of /i > 1 are especially unlikely as the DNA bending 
towards the grooves should clearly take less energy than 
bending towards the backbone (cf Fig. |2J). Yet, the com- 
putations using the developped method were inexpensive, 
the data could be easily generated, and the broad range 
of parameters was studied for the sake of generality of our 
results in regard to the behavior of twisted elastic rods. 

The parameters were changed in two "sweeping" iter- 
ation cycles, which started from the isotropic solutions 
with a — (3 — 2/3 and went all over the studied range. 
A/C was changed in the first cycle, and /i - in the sec- 
ond, nested cycle, a and (3 were computed accordingly 
on each step, and the new solutions were generated. The 
results of the computations are presented in Figs. |5J EH 
for the U and O solutions, respectively. 

Expectedly, the energy U of the elastic rod grows when 
the bending rigidity A/C is growing. As the relative 
energetic cost of the bending and twisting changes, the 
elastic rod changes its shape as to optimally distribute 
the deformation between bending and twisting. For ex- 
ample, when A/C is growing, the rod becomes less bent 
and more twisted on the average, the loop straightens 
itself up and the average unwinding/overwinding to in- 
creases (Figs. 03 EH- Conversely, when A/C goes down, 
the costly twisting falls to zero and the rod centerline 
becomes more significantly bent at every point. Yet, 
the rod can not straighten itself to a line, nor can |u>| 
fall below zero - therefore, at some point the structure 
of the rod approaches an asymptotic shape, the average 
unwinding/overwinding and the r.m.s.d. from the initial 
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FIG. 9: Energy and geometry of the 76 bp U solution in the broad range of elastic moduli a and /3. The plots are made in 
coordinates fi = a/fl and A/C ~ (a + f$)/2 1491 1. Top left quadrant: 3D plots for the elastic energy U, the average unwinding 
uj of the loop, and the r.m.s.d. of the loop centerline from that in the isotropic case /i = 1. To present the best view, the 
orientaion of the plots for r.m.s.d. and uo are shown from a different viewpoint than U . The dark contour on the energy map 
shows the cross-section of the map at 20 kT, which is the DNA looping energy estimated from experiment [Uj. The projection 
of the 20 kT contour on the /i - A/C plane is shown to the right of the energy map. Top right quadrant: cross-sections of the 
maps for fi = 10 -2 , 1, and 10 2 (left to right). Bottom left quadrant: cross-sections of the maps for A/C = 1/20, 2/3, and 20 
(top to bottom). Bottom right quadrant: snapshots of the loop structures arranged according to their fi values (left to right) 
and A/C values (top to bottom). 
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FIG. 10: Energy and geometry of the 76 bp O solution in the broad range of elastic moduli a and f3. The plots are made in 
coordinates fj, = a/fl and A/C « (a + /3)/2 1491 1. Top left quadrant: 3D plots for the elastic energy U, the average unwinding 
uj of the loop, and the r.m.s.d. of the loop centerline from that in the isotropic case y, = 1. To present the best view, the 
orientation of the plot for r.m.s.d. is shown from a different viewpoint than U and uj. The dark contour on the energy map 
shows the cross-section of the map at 20 kT, which is the DNA looping energy estimated from experiment [Uj. The projection 
of the 20 kT contour on the /i - A/C plane is shown to the right of the energy map. Top right quadrant: cross-sections of the 
maps for fi = 10 -2 , 1, and 10 2 (left to right). Bottom left quadrant: cross-sections of the maps for A/C = 1/20, 2/3, and 20 
(top to bottom). Bottom right quadrant: snapshots of the loop structures arranged according to their fi values (left to right) 
and A/C values (top to bottom). 
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FIG. 11: (a) 3D plot of the elastic energy difference Af7 = 
Uxj — Uo between U and solutions, in the coordinates p = 
a /(3 and A/C ~ (a + f3)/2 fS5|. (b) Contours of the plot in 
(a) for the AU values of 1, 2, . . . , 7kT. 



structure become constant, and the energy becomes sim- 
ply a linear function of the bending modulus (cf. the 
plots in Figs. El ^| for fi = 1). A similar effect has been 
observed in the studies of the bending anisotropy of a 
Mobius band 

As in the discussed specific case of fi = 1/4, intro- 
ducing the anisotropic flexibility results in a significant 
drop of the elastic energy, by as much as an order of 
magnitude. The bending concentrates in the regions 
where the loop faces the main bending direction with 
its "soft" side, and the twist develops oscillations. The 
average twist normally decreases in its absolute value, 
as described for fi = 1/4, except when the A/C rigid- 
ity becomes really large. Then it is energetically costly 
to increase bending even in the soft spots, and the ab- 
solute value of the average twist at first increases (see 
the ui plot for A/C = 20 in Fig.EJ. Yet, the further in- 
crease in the bending anisotropy further reduces a (or f3, 
if fi — > oo) and makes bending in the "soft spots" cheaper 
than twisting, so the absolute average twist eventually 
turns down after the initial increase. As the twist re- 
duces to zero, the structure of the rod eventually reaches 
an asymptotic shape, as in the case of the changed bend- 
ing rigidity. The changes in structure and energy for 
fi < 1 and fi > 1 are nearly symmetric, because in the 
latter case the "soft spots" simply move along the loop 
centerline by a quarter of a period of the DNA helix - 
which is a mere 1/30-th of the total length of the loop. 

The cut through the 3D plot for the elastic energy 
at U = 20 kT shows that the experimentally estimated 
looping energy of 20 kT can be achieved in a wide family 
of A/C and fi - or, a and (3 - parameters. The parameter 
families for the U and O solutions are shown as contours 
on the 11- A/C plane in Figs. ED For the standard 
value of A/C = 2/3, the experimental energy is achieved 
for /i « 1/5 (or, /i w 5). Yet, it is evident that the model 
of anisotropically flexible twisted DNA can match the 
bending energy data with different sets of the bending 
moduli - so, more data from different experiments are 
needed in order to conclusively determine what specific 
moduli should be used. 

It should also be noted here that the approximation 
A/C ~ (a + (3)/2 was derived with the assumption that 



the bending of the rod is smooth enough compared to the 
intrinsic twist. The increase in the bending anisotropy, 
however, increases the curvature in the soft spots and 
eventually our assumption breaks down, i.e., the average 
(a + f3) / 2 does not correctly describe the rod rigidity 
any more for the strongly anisotropic case. This adds 
uncertainty to the choice of bending moduli a and (3 and 
necessitates studying the parameters in the broad range. 

Yet, certain conclusions about the lac repressor loops 
can be made even without certainty about the values of 
a and (3. The difference in the energy between the U 
and O loops, shown in Fig. ^2 exceeds 1 kT for most 
of the studied range of a and (3, and in a significant 
part of the range even goes above 3kT. Therefore, it can 
be safely concluded that, under thermal equilibrium, the 
DNA loop formed by the lac repressor should preferably 
have the shape of the U solution. Incidentally, the shape 
of this loop changes less with the change in a and (3 
(cf Figs. |51 110|) - and therefore can with more certainty 
be used to determine such large-scale parameters of the 
loop as, for example, the radius of gyration or an average 
protein-DNA distance. 

Interestingly, as the rigidity A/C is increased, the O 
and U loops converge to the same asymptotic shape (cf 
snapshots for A/C = 20, (i = 1 in Figs.EfllTU)) - the shape 
that has the least possible bending. The only difference 
is in the twist - the total twist of the O solution is 2 it 
plus that of the U solution - and in this case the whole 
energy difference of about 7 kT comes from the difference 
in the twisting energy. 

Another notable feature of the O solution is a near 
self-contact, which occurs when, soon after leaving the 
s = end, the loop passes close to its other end and 
the attached DNA segment (Figs. H El ED- The contact 
gets closer as the bending anisotropy increases or A/C 
decreases fFig. 110(1 . If electrostatics were taken into ac- 
count at this moment, this near self-contact would inflict 
a strong energetic penalty due to the strong self-repulsion 
of the negatively charged DNA. This happens indeed, as 
it will be demonstrated in the next section, and the open 
shape and the absence of any self-contact becomes an ad- 
ditional argument in favor of selecting the U solution as 
our prediction of the shape of the real DNA loop formed 
by the real lac repressor. 



D. Bending anisotropy effects for the 385 bp lac 
repressor loops 

The described effects of bending anisotropy were sim- 
ilarly observed in the case of the 385 bp loops. Bending 
concentrated in the "soft joints" and the solutions devel- 
oped oscillations of curvature, twist, and energy density. 
For (i = 1/4, the solutions became more bent and less 
twisted on the average, as the data in Table [H] show (cf. 
Table HJ . The U solution was the one to undergo the 
least change from its isotropic shape, while the solutions 
O and O'were those that changed the most. 
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TABLE II: Geometric and energetic properties of the four 
solutions of the BVP problem for the 385 bp-long DNA loop, 
in the case of fi = 1/4 (a = 4/15, f3 = 16/15). R.m.s.d. s are 
computed with respect to the isotropically flexible structures 
with a = /3= 2/3. 



In the broad range of parameters a and /?, the four long 
loops showed the same tendencies as the two short loops. 
The bending anisotropy allowed for a significant reduc- 
tion in the elastic energies, making the loops effectively 
softer (more bendable). The shapes of the loops, after 
undergoing some transformation following the change in 
the bending anisotropy or in the loop rigidity, eventu- 
ally reached asymptotic states. The asymptotic states 
for the soft loops (those with the small A/C ratio and/or 
high fi) were strongly bent conformations with practically 
zero unwinding/overwinding, where the twisting energy 
was of the same order as the small bending energy. The 
asymptotic states for the rigid loops (those with the large 
A/C ratio and ^ on the order of 1) were the conforma- 
tions with the least possible bending for each given loop 
topology, where the twisting achieved the worst possible 
value so as to optimized the bending, yet the latter still 
accounted for most of the elastic energy. Same as in the 
case of the short loops, the shapes of the overwound and 
underwound solutions converged when the loop rigidity 
achieved particularly large values. 

The underwound U solution was again the one to have 
the least elastic energy among the four solutions through- 
out the whole studied range of a and fi (A/C and /i) 
values. The maps of the energy difference between U 
and the other three solutions are shown in Fig. ^] The 
energy of the O solution does not normally differ from 
that of the U solution by more than several kT, there- 
fore the O solution should contribute to a small extent to 
the thermodynamic ensemble of the loop shapes and the 
properties of the lac repressor-bound 385 bp DNA loop. 
In contrast, the energies of the U' and O' loops are con- 
sistently 2-2.5 times larger than the energy of the U loop 
- and the difference amounts to small kT values only for 
unlikely combinations of A/C and /Lt. Therefore, one can 
safely conclude that these two loops, even though unin- 
hibited by steric overlap with the lac repressor, are still 
extraneous solutions, as in the case of the 76 bp loop, 
and may be safely excluded from any computation in- 
volving the properties of the thermodynamic ensemble 
of the 385 bp loop conformations. 



(a) 




FIG. 12: Elastic energy difference between U and (a) O, 
(b) U', and (c) O' solutions for the 385 bp-long DNA loop. 
3D plots of the energy difference in the coordinates fi — a//3 
and A/C « (a + /3)/2 1491 are shown on the left, and contour 
maps of the 3D plots for the AU values of 0.5, 1, 1.5, 2, 2.5, 
and 3kT are shown on the right. 



V. ELECTROSTATIC EFFECTS 



The last, but perhaps, the most important extension of 
the classic theory that is included in our model, consists 
of "charging" the modeled DNA molecule. The phos- 
phate groups of a real DNA carry a substantial electric 
charge: — 20. 8e per helical turn, that significantly influ- 
ences the conformational properties of DNA [53|, |5J, |55| • 
The DNA experiences strong self-repulsion that stiffens 
the helix and increases the distance of separation at the 
points of near self-contact. Also, all electrostatically 
charged objects in the vicinity of a DNA - such as amino 
acids of an attached protein, or lipids of a nearby nuclear 
membrane - interact with the DNA charges and influence 
the DNA conformations. Below, we describe our model 
of the electrostatic properties of DNA and the effects of 
electrostatics on the conformation of the lac repressor 
DNA loops. 
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Changes to the equations of elasticity due to 
electrostatics 



The electrostatic interactions of DNA with itself and 
any surrounding charges are introduced in our theory 
through the body forces /: 



f(s) = a(s)E(r(s)) 



(50) 



where E is the electric field at the point r(s) and <r(s) is 
the density of DNA electric charge at the point s. The 
present simplified treatment places the DNA c harg es on 
the centerline, as it was done in other studies [2!|. Im- 
plications of a more realistic model will be discussed in 
Sec. El 

The charge density <r(s) is modeled by a smooth diffcr- 
cntiablc function with relatively sharp maxima between 
the DNA base pairs, where the phosphate charges should 
be located. The chosen (dimensionless) function: 



f(s) = ^QdnaSUI 



(itN dna s) 



(51) 



is somewhat arbitrary but specifics are unlikely to sig- 
nificantly influence the results of our computations, as 
will be discussed below. iV DNA denotes the number of 
base pairs in the modeled DNA loop (which is assumed 
to be starting and ending with a base pair) and Q D na 
denotes the total charge of that DNA loop. That charge 
is reduced from its regular value of 2e per base pair due 
to Manning counterion condensation around the phos- 
phates [53]: Qdna = 2 e xN DNA . In this work, we assume 
X = 0.25, which is an observed value for a broad range 
of salt concentrations [53j . 

The electric field E is composed of the field of external 
electric charges, not associated with the modeled elastic 
rod - whichever are included in the model - and from the 
field of the modeled DNA itself (Fig. ^3 . E is computed 
using the Debye screening formula: 



E(f(s)) = 



1 



eM-\r(s)-Ri\/X) 



4iree \^ \f{s)-Ri\ 
v exp(-|r(s) -ffoOIA) 

j 



\f(s) -f(sj)| 



where A = ih/^Jc^ is the radius of Debye screening in 
an aqueous solution of mono valent electrolyte of molar 
concentration c s at 25 °C [&|. The dielectric permittivity 
e is that of water: e — 80. 

The first term in eq. (|52|) represents the DNA inter- 
action with external charges located at the points 
Ri, the sum runs over all those charges. The second 
term represents the self-repulsion of the DNA loop, and 
that sum runs over all the maxima of the charge den- 
sity cr(s), where the DNA phosphates - charges of 2e\ 
- are located. This sum approximates the integral over 
the charged elastic rod; computing such integral would 




FIG. 13: Electrostatic interactions in the elastic rod prob- 
lem. The electric field E(s) is computed at each point s of 
the rod as the sum of the "external" field E ext , produced by 
the charges qi, not associated with the elastic rod, and the 
"internal" field E lnt , produced by the charges q) nt = 2e\, 
placed in the maxima of the charge density a(s) of the elastic 
rod 151H . Here, Afi(s) = r(s) — R, and Afj(s) = f(s) — r(sj); 
see eq. (1521 for detail. The area of the rod shown in light color 
lies within the cutoff Ss and is excluded from computations 
of the electric field at the point s. 



be more consistent with the chosen model. However, this 
approximation is rather accurate, as will be shown below, 
and is made in order to significantly reduce the amount 
of computations required to calculate the electric field. 

More importantly, certain phosphate charges are ex- 
cluded from the summation in the second term (hence 
the prime sign next to the sum). Those excluded are 
the charges that are located closer to the point s than a 
certain cutoff distance 6s (Fig. I13fl . The reason for in- 
troducing such cutoff is that the DNA elasticity has par- 
tially electrostatic origin, so that the energetic penalty 
for DNA bending and twisting, approximated by the 
elastic functional l|31(l . already includes the contribu- 
tion from electrostatic repulsion between the neighbor- 
ing DNA charges. It is debatable what "neighboring" 
implies here, i.e., how close should two DNA phosphates 
be in order to significantly contribute to DNA elasticity. 
In this work, the cutoff distance 6s is chosen to be equal 
to one step of the DNA helix (/i=36 A). This, on the one 
hand, is the size of the smallest structural unit of DNA, 
beyond which it does not make sense at all to use a con- 
tinuum model of the double helix - so, at the very least, 
the phosphate pairs within such unit should be excluded 
from the explicit electrostatic component. On the other 
hand, the forces of interaction between the phosphates, 
separated by more than that distance from each other, 
are already much smaller than the elastic force, as will 
be shown below. Thus, even though the chosen cutoff 6s 
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might be too small, the resulting concomitant stiffening 
of the DNA is negligible. For comparison, calculations 
with cutoff values Ss = 1.5 h and Ss = 2 h were also per- 
formed. 

Thus, the electric field E, computed using 1)52(1 . is 

substituted into J5UJ), and the resulting body forces / 
appear in Eqs. i|5rl)) . lj»T7|) . of the "grand" system, 
in place of the previously zeroed terms. "Unzeroing" 
those terms, however, is not the only change to the equa- 
tions. More importantly, these body force terms depend 
on the conformation of the entire elastic loop due to 
the self-repulsion term in 1)52(1 . This makes the previ- 
ously ordinary differential equations of elasticity integro- 
differential and therefore, requires a new algorithm for 
solving them. The solutions of the integro-differential 
equations minimize the new energy functional: 



(a) 



U — Udastic + Uq — UQ tB traight 



(53) 



where V elastic is the elastic energy computed as in (13111 . 
Uq is the electrostatic energy computed, in accordance 
with lO, as 



and UQ tS traight is the electrostatic "ground state" energy: 
computed using the same formula ((54|) for the straight 
DNA segment of the same length as the studied loop. 



B. Changes to the computational algorithm; 
results for the 76 bp lac repressor loops 
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To solve the integro-differential equations, the follow- 
ing algorithm is employed. As previously, the electro- 
static interactions are "turned on" during a separate it- 
eration cycle. At each step of the latter, the equations 
are solved for the electric field Ei — w E E, where the 
"electrostatic weight" w E grows linearly from to 1. 

However, each step of this iteration cycle becomes its 
own iterative sub-cycle. The electric held Ei is computed 
at the beginning of the sub-cycle and the equations are 
solved with this, constant held. Then the held is re- 
computed for the new conformation of the elastic rod, the 
equations are solved again for the new held, and so on 
until convergence of the rod to a permanent conformation 
(and, consequently, of the held to a permanent value). 
The weight w E is kept constant throughout the sub-cycle. 
To enforce convergence, the field used in each round of 
the sub-cycle is weight-averaged with that used in the 
previous round: 

Eij = w a E it j( actua i) + (1 - w a )E it j_i (55) 



FIG. 14: Changes in the predicted structure and energy of the 
76 bp DNA loops after electrostatic interactions are taken into 
account. Left column: the U solution; right column: the O 
solution. (a,b) Elastic (E), electrostatic (Q), and total (T) 
energy of the elastic loop vs. the weight w of the electrostatic 
interactions. (c,d) Uncharged (w = 0, in light color) and 
completely charged (to = 1, in dark color) structures of the 
elastic loop. The bottom views are rotated by 70° around 
the vertical axis relative to the top views. (e,f) The r.m.s.d. 
between the charged and uncharged loop structures, in DNA 
helical steps h. The data in this figure correspond to a = 
4/15, f3 = 16/15, ionic strength of 10 mM and an electrostatics 
exclusion radius Ss — h. 



The averaging weight w a is selected by trial and error 
so as to speed up convergence. For the lac repressor 
system, the trivial choice of w a = 0.5 turned out to be 
satisfactory. 

This approach to solving the integro-differential equa- 
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tions assumes that the elastic rod conformation changes 
smoothly with the growth of the electric field. For intri- 
cate rod conformations, which might change in a com- 
plicated manner with the addition of even small electro- 
static forces, this approach may conceivably fail. Yet, it 
worked extremely well for the studied case of the DNA 
loop clamped by the lac repressor. 

The changes to the structure and energy of the 76 bp 
DNA loops due to electrostatic interactions were com- 
puted for a broad range of ionic strength c s (0, 10, 15, 
20, 25, 50, and 100 mM) and three different cutoff values 
Ss (h, 1.5 h, 2h). The computations were performed with 
A/C = 2/3 and the previously selected fi = 1/4 (result- 
ing in the elastic moduli of a = 4/15 and (5 = 16/15). 
The external charges included in the model were those as- 
sociated with the phosphates of the DNA segments from 
the crystal structure |4(| (see Fig. FTTH) . The iteration 
cycle was divided into 100 sub-cycles, which showed a 
remarkable convergence: the length of no sub-cycle ex- 
ceeded three iteration rounds. 

The changes in the structure and energy of the elas- 
tic loops due to electrostatic interactions are presented 
in Fig. E| for the ionic strength of 10 mM and the ex- 
clusion radius of h. The structure of the U solution al- 
most does not change: the r.m.s.d. between the original 
(w E = 0) and the final (w E — 1) structures does not ex- 
ceed 1 h. Neither do the curvature and twist profiles of 
this loop significantly change (Fig. [BJ 3d column). The 
energy of the loop changes by the electrostatic contribu- 
tion of 6.1 kT, that - because the structure is not chang- 
ing - practically does not depend on w E (Fig. I14|) . This 
energy increase is mainly accounted for by the interaction 
of the loop termini with each other and the external DNA 
segments (Fig.|SJ). The self-repulsion accounts for 55% of 
the electrostatic energy; 45% comes from the interaction 
with the external DNA segments. The apparent reason 
for the absence of a large change in the geometry of the 
U loop lies in the fact that this loop is an open semicircu- 
lar structure, which at no point comes into close contact 
with itself or the external DNA. 

In contrast, the initial structure of the O loop is bent 
over so that the beginning of the loop almost touches 
the end of the loop and the attached DNA segment 
fFig. IS1 ITHd). As a result, the electrostatic interactions 
force a significant change in the structure and energy 
fFig. IT4lb. d. D. The structure opens up, the gap at the 
point of the near self-crossing increases, the r.m.s.d. be- 
tween the final and the original structures reaches 9 ft. 
(Fig.ll4lf): the DNA overwinding almost doubles (Fig. [HJ 
6th column) . This allows the electrostatic energy to drop 
from 13.2 kT to 8.0 kT, yet the elastic energy grows by 
1.7 kT (Fig. E|b); together, the energy reaches 36.3 kT 
so that the gap from the U loop increases from 3.3 kT 
to 7.4 kT. As in the case of the U loop, the main con- 
tribution to the electrostatic interactions comes from the 
loop ends (Fig. [fJJ ; the energy distribution between self- 
repulsion and the interactions with the external DNA 
charges is practically the same. 
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FIG. 15: The effect of ionic strength on the predicted struc- 
ture and energy of the 76 bp DNA loops. Left column: the U 
solution; right column: the O solution. (a,b) Elastic (E), elec- 
trostatic (Q), and total (T) energy of the elastic loop. Shown 
are only the plots for the exclusion radius of h; the energy 
plots for the other exclusion radii are almost indistinguish- 
able. (c,d) Snapshots of the elastic loop structures for 10 mM 
(dark), 25 mM (medium), and 100 mM (light color). Points 
where the snapshots were taken are shown as dots of the cor- 
responding colors on the axes of panels (a), (b), (e), (f). The 
bottom views are rotated by 70° around the vertical axis rela- 
tive to the top views. (e,f) The r.m.s.d. of the loop structures 
from those computed without electrostatics (equivalent to in- 
finitely high salt). The lines, from top to bottom, correspond 
to the exclusion radii of h, 1.5 h, and 2 h. 



Naturally, the calculated effect diminishes when the 
ionic strength of the solution increases and therefore the 
electrostatics becomes more strongly screened. Fig. 1151 
shows what happens to the structure and energy of the 
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FIG. 16: The effect of ionic strength on the energy and geometry of the 385 bp DNA loops. Elastic (E), electrostatic (Q), and 
total (T) energy (left panel) and r.m.s.d. from the uncharged structures (right panel) are plotted for the U, U', O, and O' loops 
vs the ionic strength c s for the exclusion radius of 3 h. The missing points on the U' and O' plots indicate solutions, missing 
due to the non- convergence of the iterative procedure (see text). 



U and O loops when the ionic strength changes in the 
range of 10 mM - 100 mM (which covers the range of 
physiological ionic strengths). The structure and elastic 
energy of the U loop again show almost no change, the 
total energy of that loop falls from 29.7 kT to 23.5 kT due 
to the drop in electrostatic energy. The structure of the 
O loop returns to almost what it was before the electro- 
statics was computed (within the r.m.s.d. of 2.2 h); the 
elastic energy of this loop drops back to 26.6 kT and the 
electrostatic energy - to mere 0.5 kT. This results show 
that theoretical estimates of the energy of a DNA loop 
formation in vivo need to employ as good an estimate of 
the ionic strength conditions as possible. 

The lac repressor loops were extensively used to ana- 
lyze all the assumptions and approximations of our model 
and showed that those were satisfactory indeed. The cal- 
culations were repeated for the self-repulsion cutoffs of 
6s = 1.5 h and 6s — 2 h. The resulting change in the loop 
energy at c s =10 mM equals only AUh-i.5h = 0.35 kT 
and A Uh-2 h = 0.75 kT for the U loop, and A {7^-1.5 h — 
0.35 kT and AU h -2h = 0.75 kT for the O loop; all these 
values drop to below 0.1 kT when the ionic strength rises 
to 100 mM. The difference is mainly in electrostatic en- 
ergy, and the elastic energy is always within 0.1 kT of 
that of the structures obtained with Ss = h. Accord- 
ingly, the r.m.s.d. from the uncharged structure varies 
by at most 0.1 h for the different cutoffs (Fig. I15le. fl. 
Therefore, even the "largest" cutoff 6s = 2 h is satisfac- 
tory for the electrostatic calculations - while at the same 
time increasing the speed of computations. 

An additional advantage of using larger cutoff is that 
the concomitant stiffening of the modeled DNA, which 
possibly takes place due to too many phosphate pairs in- 



cluded in the electrostatic interactions, is reduced. Such 
stiffening, however, is truly negligible: in all the cases, the 
electrostatic force does not exceed 1-2 pN per base pair, 
compared to the calculated elastic force in the range of 
10-20 pN. Changes in the cutoff results in only insignif- 
icant changes of the electrostatic force. Nor does eval- 
uating the electric field and energy using the sums i|52|) 
and l|54|) (instead of a more consistent integral over the 
loop centerline) result in any significant difference. Test 
calculations showed that in all the studied cases the two 
ways of evaluating the energy differ by at most 0.02%. 

Finally, it was tested in how far the particular 
choice d51fl for the charge density of DNA a(s) influences 
the computation results. The calculations were repeated 
for the constant charge density cr'(s) = Q DNA (in dimcn- 
sionless representation) . The energies of the loop confor- 
mation never changed by more than 10 -4 of their values 
in the whole range of c s and Ss; the r.m.s.d. between the 
loop conformations obtained with different cr(s) never ex- 
ceeded 0.01 h. Therefore, the electrostatic properties of 
the elastic rod in the current model can safely be com- 
puted with constant electrostatic density, further saving 
the computation time. 



C. Results for the 385 bp lac repressor loops 

The electrostatic computations were similarly per- 
formed for the 385 bp loops, in the same range of salt 
concentrations and for exclusion radii of 2 h and 3 h. For 
the U and O loops, the results were qualitatively the 
same as in the case of the short loops. The elastic loops 
became more open and straigtened up with respect to 
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the lac repressor; the energy of the loops increased by 0- 
6kT, depending on the salt concentration fFig.H6|). The 
U loop was again the one to change its structure and 
energy to the least extent. The results of the computa- 
tions using the different cutoff radii were practically the 
same; approximating the self-repulsion field by a discrete 
sum (|52H gave almost exact results; replacing the charge 
density function (|51|) with the constant function cr'(s) 
had no significant influence on the results. 

One difference from the short loop case consisted, 
though, in the more significant change of the long loop 
structures with respect to the uncharged loops. The 
r.m.s.d. s reached 10 h for the U loop and 25 h for the 
O loop (Fig. E| cf Fig. H51e. f). As previously, the ma- 
jor part of the electrostatic effect consisted in repulsion 
between the ends of the loop, brought closely together 
by the protein, and the protein-attached DNA segments. 
This repulsion tended to change the direction of the ends 
of the loop, bending them away from each other and the 
DNA segments. In the case of the short loops it was 
impossible to notably change the direction without sig- 
nificantly stressing the rest of the loop. Yet the long loops 
could more easily accomodate opening up at the ends and 
therefore changed their structures more significantly. 

The larger structural change necessitated longer calcu- 
lations. For the long loops, the iteration steps typically 
consisted of 5-6 iteration sub-cycles, and even of a few 
dozen sub-cycles at especially stiff steps. 

The U' and O' loops showed a similar responce to the 
electrostatics at high salt concentration (above 25 mM). 
Their electrostatic energy lied in the range 0-5 kT, and 
the structural change due to the increased bending of the 
loop ends amounted to up to 10 h r.m.s.d. from the un- 
charged structure for the U' solution and up to 15 h - for 
the O' solution fFig. H6[) . Yet, low salt and stronger elec- 
trostatics rendered the solutions instable. Electrostatic 
computations with no salt screening (0 mM) transformed 
the U' solution into the U solution and the O' solution 
- into the O solution. What made the solutions instable 
was apparently the ever increasing bending of the ends of 
the loops away from each other that, in combination with 
the bending anisotropy, also caused high twist oscillations 
near the loop ends (as described in Sec. UVBl) . The com- 
bination of twisting and bending caused the loops to flip 
up - as one can cause a piece of wire to flip up and down 
by twisting its ends between one's fingers. 

For the intermediate salt concentrations (10-20 mM) 
the oscillations of the intermediate solutions between the 
two possible states resulted in non-convergence of the it- 
erative procedure. In this respect, using the larger elec- 
trostatic exclusion radius improved convergence. For the 
exclusion radius of 2 h, the iterations did not converge 
for salt concentrations of 15 and 20 mM; convergence for 
the 10 mM salt was achieved but resulted again in flip- 
ping up to the stable solutions. For the exclusion radius 
of 3 h, the iterations successfully converged to U' and O' 
solutions (albeit somewhat changed due to the electro- 
statics) for the 15 and 20 mM salt concentrations and 



did not converge for 10 mM only. 

Such instability of the U' and O' loops serves, of course, 
as another argument for disregarding them in favor of the 
stable U and O solutions. 

Upon introducing the electrostatic self-repulsion, an 
interesting experiment could be performed. Self-crossing 
by the solutions during the iteration cycles - as described 
in Sec. IIII Bl - was no longer possible. Therefore, one 
could explore whether superhelical structures of the loop 
could be built by further twisting the ends of the loop. 
One extra turn around the cross-section at the s — 1 end 
did generate new structures of the U and O loops. Yet, 
those structures were so stressed and had such a high 
energy (on the order of 50 kT higher than their predeces- 
sors) that it was obvious that those structures will not 
play any part at all in the real thermodynamic ensemble 
ot the lac repressor loops. Any further twisting of the 
ends resulted in non-convergence of the iterative proce- 
dure. Clearly, this length of a DNA loop is insufficient 
to produce a rich spectrum of superhelical structures. 



VI. DISCUSSION 

Below, we will review the presented modeling method 
and its limitations, compare it with the previously re- 
ported similar models, discuss the possible applications 
and the further developments of the method, and summa- 
rize what has been learned about the lac repressor-DNA 
complex. 



A. Summary of the model 

The presented modeling method consists in approxi- 
mating DNA loops with electrostatically charged elas- 
tic rods and computing their equilibrium conformations 
by solving the modified Kirchhoff equations of elastic- 
ity. The solutions to these equations provide zero- 
temperature structures of DNA loops - the equilibrium 
points, around which the loops fluctuate at a finite tem- 
perature. From these solutions, one can automatically 
obtain both global and local structural parameters, such 
as the twist and curvature at every point of the loop, 
the loop's radius of gyration, the linking number of the 
loop and its distribution between writhe and twist, vari- 
ous protein-DNA distances, the distances between DNA 
sites of special interest, etc. The solutions readily pro- 
vide an estimate of the energy of the DNA loop and the 
forces that the protein has to muster in order to confine 
the loop termini to the given conformation, the distribu- 
tion of the energy between bending and twisting, and the 
profiles of stress and energy along the DNA loop. 

Our method allows to build a family of topologically 
different loop structures by applying such simple geomet- 
rical transformations as twisting and rotating the loop 
ends, or varying the initial loop conformation that serves 
as the starting point for solving the BVP. By comparing 
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the energies of the tope-logically different conformations 
and assuming the Boltzmann probability to find the real 
DNA loop in either of them, one can either deduce the 
lowest energy structure which the loop should predomi- 
nantly adopt - as was mostly the case in the studied lac 
repressor-DNA complex - or to compute the loop prop- 
erties of interest by taking Boltzmann averages among 
several obtained structures of comparable energies. 

For DNA loops of a size of several persistence lengths 
(150 bp [ii||5(j), only a few topologically different struc- 
tures of comparable energy can exist and our simplified 
search of the conformational space should be sufficient 
to discover all members of the topological ensemble - as 
it has been demonstrated in this work. Yet longer DNA 
loops should produce large topological families of struc- 
tures and, unless a fast exhaustive search procedure is 
discovered, this limits the applicability of our method to 
DNA loop on the order of or shorter than 1,000 bp. The 
method is still good for generating sample structures of 
longer loops, but more structures of comparable energy 
are likely to be missed. 

The lower boundary of applicability of our method is 
stipulated by the loop diameter: the Kirchhoff theory 
is applicable only if the elastic rod is much longer than 
its diameter. The diameter of the DNA double helix is 
2 nm, which is equivalent to 6 bp. Therefore, the loop 
studied by our method should be at least several times 
longer than that: roughly, 50 bp and longer. Ideally, 
one would request that the loop exceed the persistence 
length of DNA, yet proteins are known to bend DNA on 
smaller scales - as, for example, the lac repressor does - 
so we apply the theory to DNA loops of around 100 bp 
in length. 

Hence, we suggest that the presented model can be 
applied to studying the conformations of DNA loops of 
about 100 bp to 1,000 bp length. Building a single con- 
formation is a fast process that takes only several hours 
of computation on a single workstation. The problem 
can be solved for a certain set of boundary conditions, as 
presented here - or the loop boundaries can be systemat- 
ically moved and rotated and the dependence of the loop 
properties on the boundary conditions can be studied by 
re-solving the problem in each new case. Such approach 
has the advantage over the Monte Carlo simulations in 
avoiding having to build and analyze a massive set of 
structures sampling the conformational space. 

At a finite temperature, the DNA loops exist as an en- 
semble of conformations. While generating all feasible 
topologically different structures, our method still ne- 
glects the thermal vibrations of each of them and the 
related entropic effects. Yet, those effects are likely to be 
insignificant, since the length of the studied DNA loops is 
limited to several persistence lengths, as discussed above. 
The related thermal oscillations of the loop structure 
should be small, although a separate study to quantify 
the effect of the oscillations seems worthwhile. For longer 
DNA loops, an extensive sampling of the conformational 
space - for example, using the Monte Carlo approach - 



is indispensable. 



B. Advanced features and potential applications 

Compared to the pre-existing analytical and com- 
putational DNA models based on the theory of elas- 
ticity e m m m m m m m m, 0U r 

model provides a universal and flexible description of 
DNA properties and interactions. Most previous meth- 
ods either considered the DNA to be isotropically flexi- 
ble [H E |H IHIH H| HI E3 , or did not consider the 
effects of the DNA intrinsic twist and curvature [2{|, or 
limited the treatment to a purely elastic model, that is, to 
the cases when the electrostatic properties of DNA could 
be disregarded [l^. I20L |47| . In the present work, a model 
of anisotropically flexible, electrostatically charged DNA 
with intrinsic twist and intrinsic curvature has been em- 
ployed, Kirchhoff equations were derived in their most 
general form (|36I46() , and all these DNA properties have 
been extensively studied in the case of the lac repressor 
loops. As it has been shown in Sec. IIVI the combination 
of the intrinsic twist with the anisotropic flexibility is es- 
sential in order to correctly estimate the energy of the 
DNA loop, as well as the local bend and twist at each 
point of the loop. The electrostatic interactions are im- 
portant in the case of a close contact of the loop with 
itself or with other molecules (Sec. [Vj) . The universality 
of our approach allows us to include all these cases into 
the scope of approachable problems. 

Moreover, all the DNA parameters: the bending mod- 
uli, the intrinsic twist and curvature, the charge den- 
sity - are treated in our equations as functions of the 
arclength. This provides an automatic tool for study- 
ing the sequence-dependent properties of the DNA loops. 
The elastic moduli and intrinsic parameters need to be 
specified for different DNA sequences, e.g., as outlined 
in [23, IH, 1^3 ■ Then the properties of any loop are ap- 
proximated by functions a(s), 0(s), j(s), k\ 2 (s), w°(s), 
smoothly changing their values between those associated 
with each base pair in the loop sequence (as discussed 
in App. [DJ) and the influence of the DNA sequence on 
the conformation and energy of that loop can be stud- 
ied by simply substituting their functions associated with 
different DNA sequences into Eqs. (|36I46|) . Of course, ex- 
tensive testing of such parameters and comparison with 
experimental results on DNA bending and loop energet- 
ics has to be done prior to performing any such studies 
of sequence-dependent effects. 

Another opportunity that our model provides consists 
in its ability to mimic the effect of protein binding within 
DNA loops. For example, if a protein is known to bend 
DNA over a certain region 86] then the intrinsic curva- 
ture term can be adjusted so as to enforce the required 
curvature over that region of the DNA loop. To strictly 
enforce the curvature, the rigidity of that region would 
have to be increased as well. Studying the solutions to the 
Kirchhoff equations with such intrinsic curvature term 
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would reveal what changes to the energy and conforma- 
tion occur in the DNA l oop upon binding of the specified 
protein. A recent study of the CAP protein bind- 

ing within the lac repressor loop, performed in this vein, 
suggested an explanation for the experimentally observed 
cooperation between the CAP and the lac repressor in 
DNA binding HHJ. 

Certainly, the different features of our model have to be 
used only as each specific problems dictates. For exam- 
ple, the model of isotropically flexible DNA seems suffi- 
cient for determining the global properties of DNA loops, 
such as the linking numbers of different conformations, 
their radii of gyration, or the energy distribution between 
bend and twist. In the present study, increasing the com- 
plexity of the model and varying the elasticity properties 
did not much change the shape of the loop, which deter- 
mines these global properties (cf. Figs. ^ 171 101 ITU IT5j) . 
Presumably, the reduced model should be sufficient for 
computing the global properties of other DNA loops. 

If, however, the energy of the loop has to be estimated, 
or the energies of different loop conformations need to 
be compared, or, indeed, the local structure at a cer- 
tain point of the loop needs to be predicted, with a view 
to study how binding of a certain protein in that area 
changes upon the loop formation - then the anisotropic 
loop model becomes imperative, as it has been demon- 
strated in Sec. IIVI Finally, the problems which involve 
loop conformations with a near self-contact - such as 
those of the O loop or the tightly wound superhelical 
structures [2^, 0, - necessitate including the elec- 
trostatic force term into Kirchhoff equations. The same 
holds for the problems studying the effect of salt concen- 
tration on structure and energetics of DNA loops. 

The speed and high adaptability of our modeling ap- 
proach makes it a good candidate for multi-scale mod- 
eling simulations. The most obvious application would 
be to study the structure and dynamics of a system sim- 
ilar to the lac repressor-DNA complex: a protein or a 
protein aggregate holding a long DNA loop. The struc- 
ture of the loop and the forces and torques that the loop 
exerts on the protein are directly obtainable from the 
elastic rod computations. Such forces would normally 
consist of the elastic forces at the loop boundaries and, 
perhaps, the electrostatic forces if parts of the DNA loop 
approach the protein-DNA complex closely. The elec- 
trostatic forces are directly computable from the pre- 
dicted loop geometry. The forces and torques can then 
be plugged in an all-atom simulation of the structure of 
the protein aggregate and the DNA segments directly 
bound to it, similarly to how it is done in steered molec- 
ular dynamics simulations The state of the DNA 
segments that provide the boundaries of the DNA loop 
can be routinely checked during the all- atom simulations. 
Whenever a notable change in the positions and orienta- 
tions of the boundaries occur, the coarse-grained elastic 
computations have to be repeated (using the previous so- 
lutions as the initial guess) and the simulations continue 
with the updated values of the forces and torques. The 



fast coarse-grained computations presented here result in 
only a marginal increase in the total time required for the 
all-atom simulation. The multi-scale simulations provide 
a direct means to study the changes in structure and dy- 
namics of the the protein aggregate under the force of 
the DNA loop it creates. 

Another application of multi-scale modeling could be 
to a system where several proteins bind to the same 
DNA loop and alter the DNA geometry at or near their 
binding sites. Such systems arise during gene transcrip- 
tion, when multiple transcription factors and RNA poly- 
merase components bind at or near a gene promoter, 
during DNA replication when helicases and gyrases con- 
currently alter the DNA topology, inside eukaryotic chro- 
matin with multiple nucleosomes clustering on long DNA 
threads ,3j. Simulations of the separate protein-DNA 
complexes could be run on atomic scale, and the stress 
imposed on the DNA in one simulation could be passed to 
another via an elastic rod model of the DNA segment (s) 
connecting the different complexes. The binding of some 
proteins could even be mimicked with the intrinsic cur- 
vature and twist terms, as outlined above. 

Finally, the local DNA geometry predicted in our 
model can be used to replace the coarse-grained elastic 
ribbon with an all-atom DNA structure in selected areas 
of the loop or even over the whole loop (cf App. 0. An 
example is shown in Fig. [31c, d, where the all-atom struc- 
ture was placed on top of the predicted 76 bp and 385 bp 
U loops. A local segment of the all-atom structure can 
be simulated with the forces and torques from the rest 
of the loop applied to its boundaries - using the same 
multi-scale simulation technique as outlined above. The 
segment can be simulated on its own, so as to compare 
the segment's structural dynamics in an unconstrained 
conformation, or inside the large bent loop. Or, if the 
segment in question is a binding site of a certain protein, 
that protein could be docked to the DNA segment, and 
the simulation of the two would allow one to study the 
changes in the protein binding to DNA upon the loop 
formation. With the advent of the computational power, 
even the simulations of the all-atom structures of mod- 
erately sized DNA loops, such as the 76 bp loop stud- 
ied here, would become possible. For such a simulation, 
the all-atom loop structure, predicted on the basis of the 
elastic loop model, can serve as a good starting point. 



C. Further developments 

Naturally, before being used in such advanced simula- 
tions, the model has to be refined and extensively tested 
using all the available experimental data. In the present 
state, we have not even converged to a single preferred 
set of elastic moduli, as a whole family of values may 
account for the lac repressor loop bending energies. Re- 
fining and adjustment of the model parameters on the 
basis of other data is at this point imperative for mak- 
ing the suggested universal elastic model fully functional 
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on all levels. Such data can come from the experiments 
on DNA interaction with other DNA-binding proteins, 
including topoisomerases [HE IH3 , from the energetics of 
DNA minicircles , and from the analysis of defor- 

mations in DNA X-ray structures, following the ideas of 
Olson ei a/ 

Another necessary adjustment would have to account 
for the well-known sequence-specificity of DNA proper- 
ties j23, H3, E3 • The bending moduli have to be deter- 
mined in the sequence-specific fashion, as functions of the 
arclength s, varying according to the DNA sequence at 
each point of the studied loop (see Add. Id)| . The intrin- 
sic curvatures 2 and varying intrinsic twist w° are also 
known to have a significant effect on the structure and 
dynamics of some DNA sequences 62] and pa- 

rameterizing these functions is therefore also important 
for our model. 

Before the parameters are better defined, the model is 
still good for solving general problems, such as selection 
between alternative DNA loop topologies or energy es- 
timates within several kT, as discussed in the previous 
section, but not for more subtle quantitative predictions 
of the structural and energetical properties of DNA loops. 

Several extensions to the model may prove necessary 
in order to achieve realistic DNA descriptions in certain 
cases. One obvious extension is including DNA deforma- 
bility into Eqs. (|36I46|) . DNA is known to be a shearable 
and extensible molecule, as is evidenced by the deforma- 
tions observed in the all-atom structures |87j or microma- 
nipulation experiments 0, l37l l7Cj . The DNA deforma- 
bility can influence both the local and global structure 
of the modeled DNA, especially in view of the observed 
coupling between the DNA stretch and twist [7lJ. In 
order to include the DNA deformability in our model, 
Eqs. (|36I46I) have to be modified as outlined in App. [E] 
raising the order of the system I|36I46[I to 16. 

Another vital modification consists in adding a steric 
repulsion parameter to the other force terms. In the 
present study, there was no need for such a term be- 
cause of its insignificance compared to the DNA self- 
repulsion. However, if the DNA interaction with a posi- 
tively charged object (e.g., the histone core of the nucle- 
osome) is to be described, then the steric repulsion term 
becomes imperative lest the elastic solutions collapse on 
the positive charges, causing non-convergence of the it- 
erative process. The steric repulsion can be described 
as the van der Waals '6-12' potential and would not be 
qualitatively different from the electrostatic repulsion in- 
troduced in l|52l) . Only the electric charges in eq. l|o2l) 
have to be replaced by the van der Waals coefficients, the 
degrees of the denominators have to be changed to 6 or 
12, and the inverse screening radius has to be set to zero 
- otherwise, the forces of steric repulsion are computed in 
the same way and through the same iterative algorithm 
applied to solving the integro-differential equations. 

As for the electrostatic self-repulsion, its description 
in our model can be rendered more realistic by placing 
the phosphate charges on the outside of the double he- 



lix rather than on the centerline as in the present sim- 
plified treatment (|52|1 . Placing the phosphates at the 
points pidi(s) + ^2^2(5) of the rod cross-section, where 
pi and P2 are determined by the DNA chemical struc- 
ture (Fig. would result in replacing the radii r(s) in 
cq. © by Rp h (s) = f(s) + pxd\{s) + p 2 d 2 (s). That 
would make the electric field dependent not only on the 
radius- vector r(s), but also on the the Euler parameters 
qi-i-i(s) that determine the orientation of the local co- 
ordinate frame. This additional dependence is unlikely 
to result in any algorithmic difficulties with solving the 
equations. Physically, however, moving the phosphate 
charges away from the centerline means introducing ex- 
ternal torques in each cross-section in additional to exter- 
nal forces, and the torques could change the calculated 
loop structure - again, unlikely in the general case, but 
possibly in the case of a close approach of the charged 
elastic ribbon to itself or any other charged object in- 
volved in the model. 

As has been noted above, the discussed modeling 
method produces static, zero-temperature structures of 
DNA loops. Yet, the entropic contribution to the free 
energy of different DNA states may sometimes be im- 
portant. Interestingly, there is a way to estimate the 
structural entropy of a DNA loop with our method. One 
could employ the intrinsic curvature/twist terms and per- 
form statistical sampling by assigning bends and unwind- 
ing/overwinding with the energetic penalty between 
and 1 kT at random points of the loop, analyzing the 
resulting changes in the loop structure and energy. Since 
the zero temperature structure of the loop is used as the 
starting point in the iterative calculations of the ran- 
domly bent and twisted structures, the iterations should 
converge much faster than those running from scratch. 
Thus the ensemble of thermally excited structures can 
be generated and used to obtain the properties of the 
studied DNA loop at a final temperature. The entropy of 
each looped state can be estimated, for example, through 
the volume of space swept by all the different structures 
from the thermal ensemble. 

With the modifications, outlined above, the proposed 
model is likely to describe DNA loops very realistically, 
yet still be less detailed and computationally much faster 
than all-atom models. One drawback, however, lies in 
the non- linear nature of the elastic problems. The solu- 
tions to the equations of elasticity are known to exhibit 
a non-trivial dependence on the problem parameters, for 
example, the boundary conditions. A latent response to 
the change of a certain condition can lead to an abrupt 
change in the shape of the BVP solution, for example, if 
the point of instability of the latter is achieved. If the so- 
lution changes too abruptly, the iterative procedure will 
fail due to non-convergence of the BVP solver. Another 
reason for the non-convergence is meeting a bifurcation 
point, as it happened at low salt concentrations with the 
U' and O' solutions for the 385 bp loop fSec.lVCT). 

Such problems seem to be inherent to our method. 
If they are encountered, we would recommend to thor- 
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oughly analyze the nature of the non-convergence, for ex- 
ample, through monitoring the evolution of the interme- 
diate solutions prior to the non-convergence. In the case 
of abrupt changes, the ultimate structure can perhaps be 
guessed, or achieved along a different pathway with an 
equivalent endpoint that might not be leading the rod 
through the point of abrupt change (for instance, rotat- 
ing the end of the elastic rod counterclockwise by 2ir — <p 
if a clockwise rotation by 4> is causing problems). In the 
case of a bifurcation, the bifurcating solutions branches 
need to be analyzed - which by itself may provide useful 
insights into the structural properties and transforma- 
tions of the studied loop. 



D. Lac repressor loops 

To conclude the paper, let us summarize what has been 
learned with our method about the specific system, the 
lac repressor and its DNA loop. For both possible lengths 
of the loop, it has been shown that the underwound loop 
structure pointing away from the lac repressor should 
be predominant under thermodynamic equilibrium con- 
ditions, unless other biomolecules interfere. The pre- 
dicted structure of the U loop depends only slightly on 
the salt concentration, although the loop energy exhibits 
a stronger dependence. The experimentally observed en- 
ergy of the loop can be obtained with the right com- 
bination of parameters - which, of course, have to be 
extensively tested with other protein-DNA systems. 

Another reason why the predicted structure of the 
DNA loop has to be treated with caution is the dy- 
namic nature of the lac repressor-DNA complex. The X- 
ray structure |40| . on which our conclusions were based, 
has been obtained without the DNA loop connecting the 
protein-bound DNA pieces and, therefore, can be pre- 
sumably changed by the stress of the bent DNA loop. 
Yet, our results pave the way for analyzing these changes. 
The forces of the protein-DNA interactions, computed 
with our model, can be applied in a multi-scale simulation 
of the lac repressor-DNA complex, as described above. 
Such simulation would reveal the equilibrium state of the 
lac repressor with the bound DNA loop, or at least show 
the spectrum of structural states visited by the protein 
during its dynamics with the attached loop. 

The results of such dynamics may even change our con- 
clusions about the structure and energetics of the DNA 
loops created by the lac repressor. When the boundary 
conditions change, the O loop can become energetically 
preferable to the U loop, or even one of the dismissed 
extraneous solutions may become feasible. Yet, to make 
multi-scale simulations possible, that would study the 
complexes between lac repressor, or other proteins, and 
the DNA loops in their dynamic nature, was the driv- 
ing force behind the development of our coarse-grained 
DNA model - which, per se, does not pretend to yield 
final conclusions about the lac repressor-DNA complex. 



VII. CONCLUSION 

In conclusion, a universal theoretical model of DNA 
loops has been presented. The model unifies several exist- 
ing DNA models and provides description of many phys- 
ical properties of real DNA: anisotropic flexibility, salt- 
dependent electrostatic self-repulsion, intrinsic twist and 
curvature - all the properties being sequence-dependent. 
The model is applicable to a broad range of problems 
regarding the interactions of DNA-binding proteins with 
DNA segments of moderate length (100-1,000 bp) and 
can serve as a basis of all-atom or multi-scale simula- 
tions of protein-DNA complexes. The application of the 
model to the lac repressor-DNA complex revealed a likely 
structure of the 76 bp and 385 bp DNA loops, created by 
the protein. The experimentally measured energy of the 
DNA loop formation is obtainable with a proper set of 
parameters. The obtained forces of the protein-DNA in- 
teractions can be used in a multi-scale simulation of the 
lac repressor-DNA complex. Further comparison with 
experimental data will be beneficial for optimizing the 
parameters and approximations of the model. 
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APPENDIX A 

The following eleven dimcnsionlcss functions consti- 
tute the initial simplified solution to the equations l|36I 
I46J) . the planar circular uncharged isotropic rod: 
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(All) 



Here ip a is an arbitrary parameter that determines the 
angle between the x axis of the LCS and the plane of the 
elastic rod. 



26 



APPENDIX B 



we arrive 



at (IH . 



According to |4!l |46| . the equilibrium constant of bind- 
ing of the lac repressor to a single-operator DNA equals 
about KT 11 M for Oi and 10~ 9 M for 3 at high salt 
concentration (0.2 M). This results in the free energies 
of binding AG 0l = kTlogA^ = -25kT and AGo 3 = 
kTlogAo 3 = — 21kT. The equilibrium binding constant 
of the lac repressor to the DNA promoter, contai ning 
both Oi and 3 sites, equals 3.4 - 6.2 x 1CT 12 M 
resulting in the free energy AGoi-o 3 ~ — 26kT. This 
results in the free energy of formation of the 76 bp DNA 
loop AG loop = AG Ql -o 3 - AG Ql - AGq 3 = 20kT. 



APPENDIX C 

In order to derive l|49|l . let us consider a short section 
of a tightly twisted rod with no intrinsic curvature. Then 
the total curvature equals K(s) = \J Ki(s) 2 + K2(s) 2 = 

k\(s) 2 + k 2 (s) 2 . The bending energy of the section is: 



2 '" L ' 2 



U K = I I ~7r~^i + -^-4 I ds 



M+M ' K*ds 



Al-A 2 f S2 r,2 



K A cos 2Tds , 
(CI) 

where T(s) is the angle between the binormal b and d\ 
(Fig. Q]b)- A simple textbook calculation using Frenet 
formulae j3^| yields 



r(s) = / (t(s') - fl(s'))ds' 



{T( S ')-Lu{s')-UJ (s'))ds' 



S < T > s — S < U> > s — S < U) > s , 



(C2) 



APPENDIX D 

The sequence-dependent parameters could be related 
to the arclength of the rod in several possible ways. The 
simplest would be a step- wise assignment of the parame- 
ters. At the points, corresponding to each DNA step be- 
tween two neighboring base pairs, the parameters (elas- 
tic moduli a, (3, and 7, intrinsic curvature k° 2 and twist 
u)°) would adopt the values correspondent to that DNA 
step. Connecting those points with smooth functions (for 
example, spline-based ones) would result in the desired 
parameter setup a(s), /3(s), etc., for the particular loop. 

A slightly modified approach could keep the DNA 
step-based parameters constant in a certain area in the 
middle between the base pairs and limit the zone of a 
smooth transition from value to value to a certain width 
S. Then, the sequence-based parameter functions be- 
tween the DNA steps to and from the base pair located 
at the point Si would look like: 

a(s) = Ui , if Si — h/2 < s < Sj — 5/2 

a i+ i + at a i+1 - a. t . tt 
"(•) = 2 + ~ 2 sm -(s - st) , (D1) 

if s t - 6/2 < s < St + 5/2 
a(s) — cii + i , if Si + 5/2 < s + h/2 < Si . 

Certainly, another smoothly differentiable function can 
be used in the transition zone Sj — 5/2 < s < Si + 5/2 
instead of the sine. 

Finally, a more complicated approach can relate the 
parameters to a longer DNA sequence surrounding each 
point s rather than to only three neighboring base pairs 
defining the two adjacent DNA steps. Such approach 
would be more realistic, as the experimental and sim- 
ulation data indicate pM l57j. In that case, the model 
parameter functions would have to depend on multiple 
base pairs neighboring each point s, for example, as de- 
scribed in Hal. 



where r(s) = — f(s) • b(s)/K(s) is the torsion of the cen- 
tcrline r(s) and the angular brackets < ... > s denote the 
average over the interval (s%, s). 

Since the rod is tightly twisted, the term ,s < lu° > s 
grows much faster with s than both other terms in (|C2|) 
and K(s). Therefore, the second term in (|C2|) is an in- 
tegral of a fast oscillating function and as such is much 
smaller than the first term. Accordingly, 



Ai + A 2 f 



K 2 ds 



(C3) 



Sl 



and, since the approximation of a single bending modulus 
assumes 

rs 2 A 



U K = 



-K 2 ds 



(C4) 



APPENDIX E 

The DNA deformability can be described in our model 
by additional three variables, combined into the shift vec- 
tor <f(s). Its components e\,2 are the amount of shear in 
the two principal directions, and the component 63 is the 
amount of extension along the normal d% |28t |29| . The 
vector of shift e and the elastic force iV are linearly re- 
lated to each other, similarly to the vector of curvature 
k and the torque M (cf eq. flU ): 



N(s) = Bieidi + B 2 e 2 d 2 + De 3 d 3 



(El) 



where B\^ 2 are the shear moduli in the two principal di- 
rections, and D is the extension modulus of DNA. Those 
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parameters would also have to be determined from and 
extensively verifie d ag ainst the experimental data, e.g. 
those presented in l5lj| . 

Thus introduced, deformability changes eq. Q into 

?=e + d 3 , (E2) 

propagating into eq. l(T4"jl. Finally, the systems (I2(J13(J|) . 
(|36I46|) become of 16-th rather than 13-th order, but 
could be similarly solved by the continuation method. 

APPENDIX F 

The following algorithm has been used in this work 
to build all-atom DNA structures on the basis of coarse- 
grained elastic rod solutions. First, the all-atom struc- 
tures of the base pairs, with the sugar phosphate back- 
bone groups attached, were built in the idealized B- 
conformations using Quanta |73[ . Second, the chosen 



elastic rod solution^was used to obtain local coordinate 
frames (di(s bP}i ), d 2 (s bp ^), c^(s bp ,i) ) from q l (s bp ,i) 
at each point s bPl i = ih/L corresponding to the loca- 
tion of each base pair of the loop along the centcrlinc 
of the DNA helix. Third, the all-atom base pairs were 
centered at the points r(s bp ^) and aligned with the co- 
ordinate frames (di(s& p ,i), d2(sbp,i), d3(s bp< i) ) as illus- 
trated in Fig. |21 Fourth, the sugar phosphate groups 
were connected with each other and the lac repressor- 
bound DNA segments by phosphodiester bonds. Fifth, 
two 50-step minimization rounds with X-PLOR |74j us- 
ing CHARM22 force field jT^ were performed in order 
to relieve bad interatomic contacts and chemical group 
conformations (bonds, angles, dihedral angles) resulting 
from this idealized placement, especially for the DNA 
backbone. The resulting all-atom structure, while still 
stressed at certain points and overidealized at others, 
presents a reasonable starting point for any all-atom or 
multi-scale simulations, as described in Sec. IVIBI 
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[81] The DNA segments were co-crystallized together with 
the protein for two reasons: (i) to reveal the structure of 
the DNA-binding domains of the lac repressor, which are 
partially unfolded when not binding a piece of DNA |40|: 
(ii) to reveal the structure of the protein-DNA interface. 

[82] In fact, the boundaries of the loop were placed on the 
third base pair from the end of each DNA segment. The 
two terminal base pairs were disregarded, because their 
structure was seriously disrupted, apparently, due to in- 
teractions with solvent. Moreover, the two terminal base 
pairs do not contact the lac repressor, so their structure 
and orientation could easily change within the continuous 
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with could be obtained by solving Eqs. 1361461 analyti- 
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terlines f(s) and r 1 (s) of those loops, defined as follows: 
r.m.s.d. = \f(s) — f*(s)| ds. Typically, the r.m.s.d. will 
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[86] So does a huge variety of DNA-binding proteins |65|. for [87] Namely, the deformations of rise, slide, and shift |2pt I3H 
example, CAP 0, TATA-binding protein |fipL or, in- EH EH- 

deed, the headgroup of the lac repressor itself |4C| . 



